Re: [Python-Dev] Cython for cPickle?
On Thu, Apr 19, 2012 at 4:55 AM, Stefan Behnel stefan...@behnel.de wrote:
> That sounds like less than two weeks of work, maybe even if we add the marshal module to it. In less than a month of GSoC time, this could easily reach a point where it's close to the speed of what we have and fast enough, but a lot more accessible and maintainable, thus also making it easier to add the extensions described in the PEP.
>
> What do you think?

As others have pointed out, many users of pickle depend on its performance. The main reason why _pickle.c is so big is all the low-level optimizations we have in there. We have custom stack and dictionary implementations just for the sake of speed. We also have fast paths for I/O operations and function calls. These optimizations alone easily take 2000 lines of code, and they are not micro-optimizations. Each of them was shown to give speedups of one to several orders of magnitude.

So I disagree that we could easily reach the point where it's close to the speed of what we have. And if we were to attempt this, it would be a multi-month undertaking. I would rather see that time spent on improving pickle than on yet another reimplementation.

-- Alexandre

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Cython for cPickle?
> So I disagree that we could easily reach the point where it's close to the speed of what we have. And if we were to attempt this, it would be a multi-month undertaking. I would rather see that time spent on improving pickle than on yet another reimplementation.

Of course, this being free software, anybody can spend time on whatever they please, and this should not make anybody feel sad. You just don't get merits if you work on stuff that nobody cares about.

Regards,
Martin
Re: [Python-Dev] Cython for cPickle?
Alexandre Vassalotti wrote:
> We have custom stack and dictionary implementations just for the sake of speed. We also have fast paths for I/O operations and function calls.

All of that could very likely be carried over almost unchanged into a Cython version. I don't see why it should take multiple months. It's not a matter of rewriting it from scratch, just translating it from one dialect (C) to another (the C subset of Cython).

-- Greg
Re: [Python-Dev] Cython for cPickle?
On Sun, Apr 22, 2012 at 6:12 PM, mar...@v.loewis.de wrote:
>> So I disagree that we could easily reach the point where it's close to the speed of what we have. And if we were to attempt this, it would be a multi-month undertaking. I would rather see that time spent on improving pickle than on yet another reimplementation.
>
> Of course, this being free software, anybody can spend time on whatever they please, and this should not make anybody feel sad. You just don't get merits if you work on stuff that nobody cares about.

Yes, of course. I don't want to discourage anyone from investigating this option; in fact, I would very much like to see myself proven wrong. But, if I understood Stefan correctly, he is proposing to have a GSoC student do the work, which I would feel uneasy about, since we have no idea how valuable this would be as a contribution.

-- Alexandre
Re: [Python-Dev] Cython for cPickle?
On Mon, Apr 23, 2012 at 9:27 AM, Alexandre Vassalotti alexan...@peadrop.com wrote:
> On Sun, Apr 22, 2012 at 6:12 PM, mar...@v.loewis.de wrote:
>> Of course, this being free software, anybody can spend time on whatever they please, and this should not make anybody feel sad. You just don't get merits if you work on stuff that nobody cares about.
>
> Yes, of course. I don't want to discourage anyone from investigating this option; in fact, I would very much like to see myself proven wrong. But, if I understood Stefan correctly, he is proposing to have a GSoC student do the work, which I would feel uneasy about, since we have no idea how valuable this would be as a contribution.

So long as it's made clear to the students applying that it's a proof of concept that may return a negative result (i.e. it was tried, it proved to be a bad idea), I don't see a problem with it. The freedom to try out multiple ideas in parallel is one of the great strengths of open source.

We've had GSoC students try unsuccessful experiments in the past and have gained useful information as a result (e.g. the main reason I know the Import Engine API proposed in the deferred PEP 406 isn't adequate as currently written is because of the design-level problems Greg found when implementing it last summer. The currently documented design simply doesn't achieve the full objectives of the PEP).

Cheers,
Nick.

-- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Python-Dev] Cython for cPickle?
On Sun, Apr 22, 2012 at 6:34 PM, Nick Coghlan ncogh...@gmail.com wrote:
> On Mon, Apr 23, 2012 at 9:27 AM, Alexandre Vassalotti alexan...@peadrop.com wrote:
>> On Sun, Apr 22, 2012 at 6:12 PM, mar...@v.loewis.de wrote:
>>> Of course, this being free software, anybody can spend time on whatever they please, and this should not make anybody feel sad. You just don't get merits if you work on stuff that nobody cares about.
>>
>> Yes, of course. I don't want to discourage anyone from investigating this option; in fact, I would very much like to see myself proven wrong. But, if I understood Stefan correctly, he is proposing to have a GSoC student do the work, which I would feel uneasy about, since we have no idea how valuable this would be as a contribution.
>
> So long as it's made clear to the students applying that it's a proof of concept that may return a negative result (i.e. it was tried, it proved to be a bad idea), I don't see a problem with it. The freedom to try out multiple ideas in parallel is one of the great strengths of open source.
>
> We've had GSoC students try unsuccessful experiments in the past and have gained useful information as a result (e.g. the main reason I know the Import Engine API proposed in the deferred PEP 406 isn't adequate as currently written is because of the design-level problems Greg found when implementing it last summer. The currently documented design simply doesn't achieve the full objectives of the PEP).

However, I think that in this case the success may be predetermined, or at least not determined by technical success alone. I have a lot of respect for Cython, but I don't think it is right to have any part of core Python depend on it. Cython is an incredibly complex and relatively young (and still fast-evolving) piece of technology, while I think that core dependencies should be minimized and limited to absolutely fundamental building blocks.

--Guido van Rossum (python.org/~guido)
Re: [Python-Dev] Cython for cPickle?
> What do you think?

I think I know what Jim Fulton thinks (as we talked about something like this at PyCon): don't. He is already sad that cPickle grew so many pickle features when it was designed as a really fast implementation.

pickle speed is really important to some users, and any loss of performance needs serious justification. Easier maintenance is not a sufficient reason.

Regards,
Martin
Re: [Python-Dev] Cython for cPickle?
On Thu, Apr 19, 2012 at 6:55 PM, Stefan Behnel stefan...@behnel.de wrote:
> What do you think?

I think the possible use of Cython for standard library extension modules is potentially worth looking into for the 3.4 timeframe (cf. the recent multiple checkins sorting out the refcounts for the new ImportError helper function).

There are obviously a lot of factors to consider before actually proceeding with such an approach (even for the extension modules), but a side-by-side comparison of pickle.py, the existing C accelerated pickle module and a Cython accelerated pickle module (including benchmark numbers) would be a valuable data point in any such discussion.

However, it would definitely have to be pitched to any interested students as a proof-of-concept exercise, with a real possibility that the outcome will end up supporting MvL's reply.

Regards,
Nick.

-- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
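For readers who want to see what two legs of such a side-by-side comparison might look like, here is a minimal sketch that times pickle.py against the C accelerator on CPython 3. It is not taken from the thread: `pickle._Pickler` is the pure-Python pickler that pickle.py keeps as an underscore-prefixed implementation detail, the `sample` data and the helper names are hypothetical, and a Cython build would have to be added as a third candidate. The numbers depend entirely on the machine and the data.

```python
import io
import pickle
import timeit


def dumps_c_accelerated(obj, protocol=pickle.HIGHEST_PROTOCOL):
    # pickle.dumps is backed by the _pickle C module on standard CPython builds.
    return pickle.dumps(obj, protocol)


def dumps_pure_python(obj, protocol=pickle.HIGHEST_PROTOCOL):
    # pickle._Pickler is the pure-Python implementation from pickle.py
    # (a private name, so treat it as an implementation detail).
    buffer = io.BytesIO()
    pickle._Pickler(buffer, protocol).dump(obj)
    return buffer.getvalue()


# Hypothetical sample data; real benchmarks should use representative workloads.
sample = [{"id": i, "name": "item-%d" % i, "tags": ["a", "b", "c"]} for i in range(200)]


def best_time(func, repeats=3, number=20):
    # Take the minimum over several repeats to reduce scheduling noise.
    return min(timeit.repeat(lambda: func(sample), repeat=repeats, number=number))


c_seconds = best_time(dumps_c_accelerated)
py_seconds = best_time(dumps_pure_python)

print("C accelerator: %.6f s" % c_seconds)
print("pickle.py:     %.6f s" % py_seconds)
```

Both functions should produce streams that load back to the same data, so a correctness check can sit next to the timing.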
Re: [Python-Dev] Cython for cPickle?
On Thu, 19 Apr 2012 10:55:24 +0200 Stefan Behnel stefan...@behnel.de wrote:
> I noticed that there is a PEP (3154) and a GSoC proposal about improving Pickle. Given the recent discussion on this list about using Cython for the import module, I wonder if it wouldn't make even more sense to switch from a C (accelerator) implementation to Cython for _pickle.

I think that's quite orthogonal to PEP 3154 (which shouldn't add a lot of new code IMHO).

> Note that the approach won't be as simple as compiling pickle.py. _pickle uses a lot of optimisations that only work at the C level, at least efficiently. So the idea would be to rewrite _pickle in Cython instead. It's currently about 6500 lines of C. Even if we divide that only by a rather conservative factor of 3, we'd end up with some 2000 lines of Cython code, all extracted straight from the existing C code. That sounds like less than two weeks of work, maybe even if we add the marshal module to it.

I think this all needs someone to demonstrate the benefits, in terms of both readability/maintainability, and performance.

Also, while C is a low-level language, Cython is a different language than Python when you start using its optimization features. This means core developers have to learn that language.

Regards

Antoine.
Re: [Python-Dev] Cython for cPickle?
On Thu, Apr 19, 2012 at 05:38, Nick Coghlan ncogh...@gmail.com wrote:
> On Thu, Apr 19, 2012 at 6:55 PM, Stefan Behnel stefan...@behnel.de wrote:
>> What do you think?
>
> I think the possible use of Cython for standard library extension modules is potentially worth looking into for the 3.4 timeframe (cf. the recent multiple checkins sorting out the refcounts for the new ImportError helper function).

I'd rather just rtfm as was suggested and get it right than switch everything around to Cython.
Re: [Python-Dev] Cython for cPickle?
On Thu, 19 Apr 2012 14:44:06 +0200, Antoine Pitrou solip...@pitrou.net wrote:
> Also, while C is a low-level language, Cython is a different language than Python when you start using its optimization features. This means core developers have to learn that language.

Hmm. On the other hand, perhaps some core developers (present or future) would prefer to learn Cython over learning C [*].

--David

[*] For this you may actually want to read "learning to modify the Python C codebase", since in fact I know how to program in C, I just prefer to do as little of it as possible, and so haven't really learned the Python C codebase.
Re: [Python-Dev] Cython for cPickle?
Personally I find the unholy product of C and Python that is Cython to be more complex than the sum of the complexities of its parts. Is it really wise to be learning Cython without already knowing C, Python, and the CPython object model?

While code generation alleviates the burden of tedious languages, it's also infinitely more complex, makes debugging very difficult and adds to prerequisite knowledge, among other drawbacks.
Re: [Python-Dev] Cython for cPickle?
Matt Joiner, 19.04.2012 16:13:
> Personally I find the unholy product of C and Python that is Cython to be more complex than the sum of the complexities of its parts. Is it really wise to be learning Cython without already knowing C, Python, and the CPython object model?

The main obstacle that I regularly see for users of the C-API is actually reference counting and an understanding of what borrowed references and owned references imply in a given code context. In fact, I can't remember seeing any C extension code getting posted on Python mailing lists (core developers excluded) that has no ref-counting bugs or at least a severe lack of error handling. Usually, such code is also accompanied by a comment that the author is not sure if everything is correct and asks for advice, and that's rather independent of the functional complexity of the code snippet. OTOH, I've also seen a couple of really dangerous code snippets already that posters apparently meant to show off with, so not everyone is aware of these obstacles.

Also, the C code by inexperienced programmers tends to be fairly inefficient because they simply do not know what impact some convenience functions have. So they tend to optimise prematurely in places where they feel more comfortable, but that can never make up for the overhead that simple and very convenient-looking C-API functions introduce in other places. Value packing comes to mind.

So, from my experience, there is a serious learning curve beyond knowing C, right from the start when trying to work on C extensions, including CPython's own code, because the C-API is far from trivial.

And that's the kind of learning curve that Cython tries to lower. It makes it substantially easier to write correct code, simply by letting you write Python code instead of C plus C-API code. And once it works, you can start making it explicitly faster by applying "I know what I'm doing" schemes to proven hot spots or by partially rewriting it.

And if you do not know yet what you're doing, then *that's* where the learning curve begins. But by then, your code is basically written, works more or less and can be benchmarked.

> While code generation alleviates the burden of tedious languages, it's also infinitely more complex, makes debugging very difficult and adds to prerequisite knowledge, among other drawbacks.

You can use gdb for source level debugging of Cython code and cProfile to profile it. Try that with C-API code.

Stefan
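The reference-counting obstacle mentioned earlier in this message can be observed from plain Python. The sketch below is an illustration added for readers, not code from the thread: it relies on `sys.getrefcount`, which is CPython-specific, and the names `Payload` and `holder` are hypothetical. The absolute counts are implementation details; only the relative change matters. In C-API code, each of these increments and decrements has to be written by hand (or knowingly skipped for a borrowed reference), which is where the bugs described above tend to creep in.

```python
import sys


class Payload:
    """Hypothetical object whose reference count we observe."""


obj = Payload()

# Baseline count; the exact value is a CPython implementation detail.
baseline = sys.getrefcount(obj)

# Storing obj in a list makes the list own one additional reference.
holder = [obj]
with_holder = sys.getrefcount(obj)

# Dropping the list releases the reference it owned.
del holder
after_release = sys.getrefcount(obj)

print("extra references while held:", with_holder - baseline)
print("extra references after release:", after_release - baseline)
```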
Re: [Python-Dev] Cython for cPickle?
On Thu, Apr 19, 2012 at 16:08, Stefan Behnel wrote:
>> While code generation alleviates the burden of tedious languages, it's also infinitely more complex, makes debugging very difficult and adds to prerequisite knowledge, among other drawbacks.
>
> You can use gdb for source level debugging of Cython code and cProfile to profile it. Try that with C-API code.

I know I'm in the minority of committers being on Windows, but we do receive a good amount of reports and contributions from Windows users who dive into the C code. The outside contributors actually gave the strongest indication that we needed to move to VS2010.

Visual Studio by itself makes debugging unbelievably easy, and with the Python Tools for VS plugin it even allows Visual Studio's built-in profiler to work. I know Windows is not on most people's maps, but if we have to scrap the debugger, that's another learning curve attachment to evaluate.
Re: [Python-Dev] Cython for cPickle?
Brian Curtin, 19.04.2012 23:19:
> On Thu, Apr 19, 2012 at 16:08, Stefan Behnel wrote:
>>> While code generation alleviates the burden of tedious languages, it's also infinitely more complex, makes debugging very difficult and adds to prerequisite knowledge, among other drawbacks.
>>
>> You can use gdb for source level debugging of Cython code and cProfile to profile it. Try that with C-API code.
>
> I know I'm in the minority of committers being on Windows, but we do receive a good amount of reports and contributions from Windows users who dive into the C code.

Doesn't match my experience at all - different software target audiences, I guess.

> Visual Studio by itself makes debugging unbelievably easy, and with the Python Tools for VS plugin it even allows Visual Studio's built-in profiler to work. I know Windows is not on most people's maps, but if we have to scrap the debugger, that's another learning curve attachment to evaluate.

What I meant was that there's pdb for debugging Python code (which doesn't know about the C code it executes) and gdb (or VS) for debugging C code, from which you can barely infer the Python code it executes. For Cython code, you can use gdb for both Cython and C, and within limits also for Python code. Here's a quick intro to see what I mean:

http://docs.cython.org/src/userguide/debugging.html

For profiling, you can use cProfile for Python code (which doesn't tell you about the C code it executes) and oprofile, callgrind, etc. (incl. VS) for C code, from which it's non-trivial to infer the relation to the Python code. With Cython, you can use cProfile for both Cython and Python code as long as you stay at the source code level, and only need to descend to a low-level profiler when you care about the exact details, usually assembly jumps and branches.

Anyway, I guess this is getting off-topic for this list.

Stefan
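The profiling point above (cProfile sees Python-level calls but not what happens inside C code) can be sketched with the standard library alone. This is an illustration added for readers, not code from the thread, and it does not involve Cython: it profiles the C-accelerated `pickle.dumps` and the pure-Python `pickle._Pickler` (a private name in pickle.py, so an implementation detail). The `payload` data and helper names are hypothetical.

```python
import cProfile
import io
import pickle
import pstats


def serialize_with_c_accelerator(payload):
    # One call into the _pickle C module; cProfile cannot look inside it.
    return pickle.dumps(payload)


def serialize_with_pure_python(payload):
    # The pure-Python pickler from pickle.py; every method call is visible.
    buffer = io.BytesIO()
    pickle._Pickler(buffer).dump(payload)
    return buffer.getvalue()


# Hypothetical sample data.
payload = [{"id": i, "name": str(i)} for i in range(2000)]


def profile(func):
    # Run func under cProfile and return (text report, number of recorded calls).
    profiler = cProfile.Profile()
    profiler.enable()
    func(payload)
    profiler.disable()
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(10)
    return stream.getvalue(), stats.total_calls


c_report, c_calls = profile(serialize_with_c_accelerator)
py_report, py_calls = profile(serialize_with_pure_python)

print("calls recorded for the C accelerator:", c_calls)
print("calls recorded for pickle.py:", py_calls)
print(py_report)
```

The C-accelerated run shows up as a handful of entries, with the whole serialization hidden behind a single built-in call, while the pure-Python run lists the individual pickler methods.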
Re: [Python-Dev] Cython for cPickle?
On Thu, Apr 19, 2012 at 17:21, Stefan Behnel stefan...@behnel.de wrote:
> Brian Curtin, 19.04.2012 23:19:
>> On Thu, Apr 19, 2012 at 16:08, Stefan Behnel wrote:
>>>> While code generation alleviates the burden of tedious languages, it's also infinitely more complex, makes debugging very difficult and adds to prerequisite knowledge, among other drawbacks.
>>>
>>> You can use gdb for source level debugging of Cython code and cProfile to profile it. Try that with C-API code.
>>
>> I know I'm in the minority of committers being on Windows, but we do receive a good amount of reports and contributions from Windows users who dive into the C code.
>
> Doesn't match my experience at all - different software target audiences, I guess.

I don't know what this means. I work on CPython, which is the target audience at hand, and I come across reports and contributions from Windows users for C extensions.

>> Visual Studio by itself makes debugging unbelievably easy, and with the Python Tools for VS plugin it even allows Visual Studio's built-in profiler to work. I know Windows is not on most people's maps, but if we have to scrap the debugger, that's another learning curve attachment to evaluate.
>
> What I meant was that there's pdb for debugging Python code (which doesn't know about the C code it executes) and gdb (or VS) for debugging C code, from which you can barely infer the Python code it executes. For Cython code, you can use gdb for both Cython and C, and within limits also for Python code. Here's a quick intro to see what I mean:
>
> http://docs.cython.org/src/userguide/debugging.html

I know what you meant. What I meant is that easy debugging on Windows goes away, and now I have to set up and learn GDB on Windows. *I* can do that. Does the rest of the community want to have to do that as well?

We should also take into consideration how something like this affects the third-party IDEs and their debugger support.