Re: [Python-Dev] Cython for cPickle?

2012-04-22 Thread Alexandre Vassalotti
On Thu, Apr 19, 2012 at 4:55 AM, Stefan Behnel stefan...@behnel.de wrote:

 That sounds like less than two weeks of work, maybe even if we add the
 marshal module to it.
 In less than a month of GSoC time, this could easily reach a point where
 it's close to the speed of what we have and fast enough, but a lot more
 accessible and maintainable, thus also making it easier to add the
 extensions described in the PEP.

 What do you think?


As others have pointed out, many users of pickle depend on its performance.
The main reason why _pickle.c is so big is all the low-level optimizations
we have in there. We have custom stack and dictionary implementations just
for the sake of speed. We also have fast paths for I/O operations and
function calls. These optimizations alone easily take 2000 lines of
code, and they are not micro-optimizations. Each of them was shown to give
speedups of one to several orders of magnitude.
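
To show why a pickler needs a dictionary at all, here is a minimal sketch using only the public pickle API. It assumes the custom dictionary mentioned above is the memo table that records objects already written to the stream; the variable names are made up for the example.

```python
import pickle

# One list object referenced twice from the same container.
shared = [1, 2, 3]
container = [shared, shared]

# The pickler records each object it has already written in a memo,
# so the second reference is emitted as a back-reference, not a copy.
restored = pickle.loads(pickle.dumps(container))

# Identity is preserved after the round trip.
same_object = restored[0] is restored[1]
print(same_object)
```

Every container or instance that gets pickled goes through this lookup, which is presumably why the speed of the memo matters so much.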

So I disagree that we could easily reach the point where it's close to the
speed of what we have. And if we were to attempt this, it would be a
multi-month undertaking. I would rather see that time spent on
improving pickle than on yet another reimplementation.

-- Alexandre
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Cython for cPickle?

2012-04-22 Thread martin

So I disagree that we could easily reach the point where it's close to the
speed of what we have. And if we were to attempt this, it would be a
multi-month undertaking. I would rather see that time spent on
improving pickle than on yet another reimplementation.


Of course, this being free software, anybody can spend time on whatever they
please, and this should not make anybody feel sad. You just don't get merits
if you work on stuff that nobody cares about.

Regards,
Martin




Re: [Python-Dev] Cython for cPickle?

2012-04-22 Thread Greg Ewing

Alexandre Vassalotti wrote:


We have custom stack and 
dictionary implementations just for the sake of speed. We also have fast 
paths for I/O operations and function calls.


All of that could very likely be carried over almost
unchanged into a Cython version. I don't see why it
should take multiple months. It's not a matter of
rewriting it from scratch, just translating it from
one dialect (C) to another (the C subset of Cython).

--
Greg


Re: [Python-Dev] Cython for cPickle?

2012-04-22 Thread Alexandre Vassalotti
On Sun, Apr 22, 2012 at 6:12 PM, mar...@v.loewis.de wrote:

  So I disagree that we could easily reach the point where it's close to
 the
 speed of what we have. And if we were to attempt this, it would be a
 multi-month undertaking. I would rather see that time spent on
 improving pickle than on yet another reimplementation.


 Of course, this being free software, anybody can spend time on whatever
 they
 please, and this should not make anybody feel sad. You just don't get
 merits
 if you work on stuff that nobody cares about.


Yes, of course. I don't want to discourage anyone from investigating this
option; in fact, I would very much like to see myself proven wrong. But, if
I understood Stefan correctly, he is proposing to have a GSoC student do
the work, which I would feel uneasy about since we have no idea how
valuable this would be as a contribution.

-- Alexandre


Re: [Python-Dev] Cython for cPickle?

2012-04-22 Thread Nick Coghlan
On Mon, Apr 23, 2012 at 9:27 AM, Alexandre Vassalotti
alexan...@peadrop.com wrote:
 On Sun, Apr 22, 2012 at 6:12 PM, mar...@v.loewis.de wrote:
 Of course, this being free software, anybody can spend time on whatever
 they
 please, and this should not make anybody feel sad. You just don't get
 merits
 if you work on stuff that nobody cares about.


 Yes, of course. I don't want to discourage anyone from investigating this
 option; in fact, I would very much like to see myself proven wrong. But, if I
 understood Stefan correctly, he is proposing to have a GSoC student do
 the work, which I would feel uneasy about since we have no idea how
 valuable this would be as a contribution.

So long as it's made clear to the students applying that it's a proof
of concept that may return a negative result (i.e. it was tried, it
proved to be a bad idea) I don't see a problem with it. The freedom
to try out multiple ideas in parallel is one of the great strengths of
open source.

We've had GSoC students try unsuccessful experiments in the past and
have gained useful information as a result (e.g. the main reason I
know the Import Engine API proposed in the deferred PEP 406 isn't
adequate as currently written is because of the design level problems
Greg found when implementing it last summer. The currently documented
design simply doesn't achieve the full objectives of the PEP)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] Cython for cPickle?

2012-04-22 Thread Guido van Rossum
On Sun, Apr 22, 2012 at 6:34 PM, Nick Coghlan ncogh...@gmail.com wrote:
 On Mon, Apr 23, 2012 at 9:27 AM, Alexandre Vassalotti
 alexan...@peadrop.com wrote:
 On Sun, Apr 22, 2012 at 6:12 PM, mar...@v.loewis.de wrote:
 Of course, this being free software, anybody can spend time on whatever
 they
 please, and this should not make anybody feel sad. You just don't get
 merits
 if you work on stuff that nobody cares about.


 Yes, of course. I don't want to discourage anyone from investigating this
 option; in fact, I would very much like to see myself proven wrong. But, if I
 understood Stefan correctly, he is proposing to have a GSoC student do
 the work, which I would feel uneasy about since we have no idea how
 valuable this would be as a contribution.

 So long as it's made clear to the students applying that it's a proof
 of concept that may return a negative result (i.e. it was tried, it
 proved to be a bad idea) I don't see a problem with it. The freedom
 to try out multiple ideas in parallel is one of the great strengths of
 open source.

 We've had GSoC students try unsuccessful experiments in the past and
 have gained useful information as a result (e.g. the main reason I
 know the Import Engine API proposed in the deferred PEP 406 isn't
 adequate as currently written is because of the design level problems
 Greg found when implementing it last summer. The currently documented
 design simply doesn't achieve the full objectives of the PEP)

However, I think that in this case the success may be predetermined,
or at least not determined by technical success alone. I have a lot of
respect for Cython, but I don't think it is right to have any part of
core Python depend on it. Cython is an incredibly complex and
relatively young (and still fast evolving) piece of technology, while
I think that core dependencies should be minimized and limited to
absolutely fundamental building blocks.

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] Cython for cPickle?

2012-04-19 Thread Martin v. Löwis
 What do you think?

I think I know what Jim Fulton thinks (as we talked about something
like this at PyCon): don't. He is already sad that cPickle grew so many
pickle features when it was designed as a really fast implementation.
pickle speed is really important to some users, and any loss of
performance needs serious justification. Easier maintenance is not
a sufficient reason.

Regards,
Martin


Re: [Python-Dev] Cython for cPickle?

2012-04-19 Thread Nick Coghlan
On Thu, Apr 19, 2012 at 6:55 PM, Stefan Behnel stefan...@behnel.de wrote:
 What do you think?

I think the possible use of Cython for standard library extension
modules is potentially worth looking into for the 3.4 timeframe (cf.
the recent multiple checkins sorting out the refcounts for the new
ImportError helper function). There are obviously a lot of factors to
consider before actually proceeding with such an approach (even for
the extension modules), but a side-by-side comparison of pickle.py,
the existing C accelerated pickle module and a Cython accelerated
pickle module (including benchmark numbers) would be a valuable data
point in any such discussion.
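
As a starting point for such a comparison, here is a minimal sketch that times the pure-Python pickler against the C-accelerated one. It assumes a CPython 3 build in which `pickle._Pickler` (a private name) is the pure-Python class from pickle.py; the workload and the helper names are hypothetical, and a Cython variant would be added as a third function.

```python
import io
import pickle
import timeit

# Hypothetical workload; real benchmarks would use representative data.
data = [{"id": i, "name": "item%d" % i, "tags": ["a", "b"]} for i in range(1000)]

def dump_pure_python(obj):
    # pickle._Pickler is the pure-Python class from pickle.py (a private name).
    buf = io.BytesIO()
    pickle._Pickler(buf, pickle.HIGHEST_PROTOCOL).dump(obj)
    return buf.getvalue()

def dump_accelerated(obj):
    # pickle.dumps uses the C accelerator when _pickle is available.
    return pickle.dumps(obj, pickle.HIGHEST_PROTOCOL)

pure_time = timeit.timeit(lambda: dump_pure_python(data), number=20)
fast_time = timeit.timeit(lambda: dump_accelerated(data), number=20)
print("pure Python: %.4fs  accelerated: %.4fs" % (pure_time, fast_time))
```

The absolute numbers depend on the machine and the data, so only the ratio between the implementations is meaningful.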

However, it would definitely have to be pitched to any interested
students as a proof-of-concept exercise, with a real possibility that
the outcome will end up supporting MvL's reply.

Regards,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] Cython for cPickle?

2012-04-19 Thread Antoine Pitrou
On Thu, 19 Apr 2012 10:55:24 +0200
Stefan Behnel stefan...@behnel.de wrote:
 
 I noticed that there is a PEP (3154) and a GSoC proposal about improving
 Pickle. Given the recent discussion on this list about using Cython for the
 import module, I wonder if it wouldn't make even more sense to switch from
 a C (accelerator) implementation to Cython for _pickle.

I think that's quite orthogonal to PEP 3154 (which shouldn't add a lot
of new code IMHO).

 Note that the approach won't be as simple as compiling pickle.py. _pickle
 uses a lot of optimisations that only work at the C level, at least
 efficiently. So the idea would be to rewrite _pickle in Cython instead.
 It's currently about 6500 lines of C. Even if we divide that only by a
 rather conservative factor of 3, we'd end up with some 2000 lines of Cython
 code, all extracted straight from the existing C code. That sounds like
 less than two weeks of work, maybe even if we add the marshal module to it.

I think this all needs someone to demonstrate the benefits, in
terms of both readability/maintainability, and performance.

Also, while C is a low-level language, Cython is a different language
than Python when you start using its optimization features. This means
core developers have to learn that language.

Regards

Antoine.




Re: [Python-Dev] Cython for cPickle?

2012-04-19 Thread Brian Curtin
On Thu, Apr 19, 2012 at 05:38, Nick Coghlan ncogh...@gmail.com wrote:
 On Thu, Apr 19, 2012 at 6:55 PM, Stefan Behnel stefan...@behnel.de wrote:
 What do you think?

 I think the possible use of Cython for standard library extension
 modules is potentially worth looking into for the 3.4 timeframe (cf.
 the recent multiple checkins sorting out the refcounts for the new
 ImportError helper function).

I'd rather just rtfm as was suggested and get it right than switch
everything around to Cython.


Re: [Python-Dev] Cython for cPickle?

2012-04-19 Thread R. David Murray
On Thu, 19 Apr 2012 14:44:06 +0200, Antoine Pitrou solip...@pitrou.net wrote:
 Also, while C is a low-level language, Cython is a different language
 than Python when you start using its optimization features. This means
 core developers have to learn that language.

Hmm.  On the other hand, perhaps some core developers (present or
future) would prefer to learn Cython over learning C [*].

--David

[*] For this you may actually want to read "learning to modify the Python
C codebase", since in fact I know how to program in C, I just prefer to
do as little of it as possible, and so haven't really learned the Python
C codebase.


Re: [Python-Dev] Cython for cPickle?

2012-04-19 Thread Matt Joiner
Personally I find the unholy product of C and Python that is Cython to be
more complex than the sum of the complexities of its parts. Is it really
wise to be learning Cython without already knowing C, Python, and the
CPython object model?

While code generation alleviates the burden of tedious languages, it's also
infinitely more complex, makes debugging very difficult and adds to
prerequisite knowledge, among other drawbacks.


Re: [Python-Dev] Cython for cPickle?

2012-04-19 Thread Stefan Behnel
Matt Joiner, 19.04.2012 16:13:
 Personally I find the unholy product of C and Python that is Cython to be
 more complex than the sum of the complexities of its parts. Is it really
 wise to be learning Cython without already knowing C, Python, and the
 CPython object model?

The main obstacle that I regularly see for users of the C-API is actually
reference counting and an understanding of what borrowed references and
owned references imply in a given code context. In fact, I can't remember
seeing any C extension code getting posted on Python mailing lists (core
developers excluded) that has no ref-counting bugs or at least a severe
lack of error handling. Usually, such code is also accompanied by a comment
that the author is not sure if everything is correct and asks for advice,
and that's rather independent of the functional complexity of the code
snippet. OTOH, I've also seen a couple of really dangerous code snippets
already that posters apparently meant to show off with, so not everyone is
aware of these obstacles.
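
For readers unfamiliar with the bookkeeping involved, here is a minimal sketch that observes reference counts from the Python side. It is specific to CPython, where `sys.getrefcount` is available; the variable names are made up. In C-API code the same increments and decrements have to be written by hand with `Py_INCREF` and `Py_DECREF`, and a function such as `PyList_GetItem` returns a borrowed reference that must not be released by the caller.

```python
import sys

obj = object()

# getrefcount reports one extra reference for its own argument.
before = sys.getrefcount(obj)

# Binding another name creates an additional owned reference.
alias = obj
after_bind = sys.getrefcount(obj)

# Dropping the name releases that reference again.
del alias
after_del = sys.getrefcount(obj)

print(before, after_bind, after_del)
```

In Python the interpreter keeps these counts balanced automatically; in hand-written C each missed step is either a leak or a crash.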

Also, the C code by inexperienced programmers tends to be fairly
inefficient because they simply do not know what impact some convenience
functions have. So they tend to optimise prematurely in places where they
feel more comfortable, but that can never make up for the overhead that
simple and very convenient-looking C-API functions introduce in other
places. Value packing comes to mind.

So, from my experience, there is a serious learning curve beyond knowing C,
right from the start when trying to work on C extensions, including
CPython's own code, because the C-API is far from trivial.

And that's the kind of learning curve that Cython tries to lower. It makes
it substantially easier to write correct code, simply by letting you write
Python code instead of C plus C-API code. And once it works, you can start
making it explicitly faster by applying "I know what I'm doing" schemes to
proven hot spots or by partially rewriting it. And if you do not know yet
what you're doing, then *that's* where the learning curve begins. But by
then, your code is basically written, works more or less and can be
benchmarked.


 While code generation alleviates the burden of tedious languages, it's also
 infinitely more complex, makes debugging very difficult and adds to
 prerequisite knowledge, among other drawbacks.

You can use gdb for source level debugging of Cython code and cProfile to
profile it. Try that with C-API code.

Stefan



Re: [Python-Dev] Cython for cPickle?

2012-04-19 Thread Brian Curtin
On Thu, Apr 19, 2012 at 16:08, Stefan Behnel
 While code generation alleviates the burden of tedious languages, it's also
 infinitely more complex, makes debugging very difficult and adds to
 prerequisite knowledge, among other drawbacks.

 You can use gdb for source level debugging of Cython code and cProfile to
 profile it. Try that with C-API code.

I know I'm in the minority of committers being on Windows, but we do
receive a good amount of reports and contributions from Windows users
who dive into the C code. The outside contributors actually gave the
strongest indication that we needed to move to VS2010.

Visual Studio by itself makes debugging unbelievably easy, and with
the Python Tools for VS plugin it even allows Visual Studio's built-in
profiler to work. I know Windows is not on most people's maps, but if
we have to scrap the debugger, that's another learning curve
attachment to evaluate.


Re: [Python-Dev] Cython for cPickle?

2012-04-19 Thread Stefan Behnel
Brian Curtin, 19.04.2012 23:19:
 On Thu, Apr 19, 2012 at 16:08, Stefan Behnel
 While code generation alleviates the burden of tedious languages, it's also
 infinitely more complex, makes debugging very difficult and adds to
 prerequisite knowledge, among other drawbacks.

 You can use gdb for source level debugging of Cython code and cProfile to
 profile it. Try that with C-API code.
 
 I know I'm in the minority of committers being on Windows, but we do
 receive a good amount of reports and contributions from Windows users
 who dive into the C code.

Doesn't match my experience at all - different software target audiences, I
guess.


 Visual Studio by itself makes debugging unbelievably easy, and with
 the Python Tools for VS plugin it even allows Visual Studio's built-in
 profiler to work. I know Windows is not on most people's maps, but if
 we have to scrap the debugger, that's another learning curve
 attachment to evaluate.

What I meant was that there's pdb for debugging Python code (which doesn't
know about the C code it executes) and gdb (or VS) for debugging C code,
from which you can barely infer the Python code it executes. For Cython
code, you can use gdb for both Cython and C, and within limits also for
Python code. Here's a quick intro to see what I mean:

http://docs.cython.org/src/userguide/debugging.html

For profiling, you can use cProfile for Python code (which doesn't tell you
about the C code it executes) and oprofile, callgrind, etc. (incl. VS) for
C code, from which it's non-trivial to infer the relation to the Python
code. With Cython, you can use cProfile for both Cython and Python code as
long as you stay at the source code level, and only need to descend to a
low-level profiler when you care about the exact details, usually assembly
jumps and branches.
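
To make the cProfile point concrete, here is a minimal sketch, assuming a CPython build with the C pickle accelerator. The functions `build_payload` and `serialize` are hypothetical. The Python-level functions show up individually in the report, while the C-level `dumps` call appears as a single opaque row.

```python
import cProfile
import io
import pickle
import pstats

def build_payload():
    # Hypothetical Python-level work that the profiler can see into.
    return [{"n": i, "square": i * i} for i in range(5000)]

def serialize():
    return pickle.dumps(build_payload())

profiler = cProfile.Profile()
profiler.enable()
blob = serialize()
profiler.disable()

# Collect the report in a string; the C-level dumps call appears as one row.
report = io.StringIO()
stats = pstats.Stats(profiler, stream=report)
stats.sort_stats("cumulative").print_stats(10)
print(report.getvalue())
```

Anything that happens inside the built-in call is invisible at this level, which is where a tool such as callgrind or oprofile would take over.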

Anyway, I guess this is getting off-topic for this list.

Stefan



Re: [Python-Dev] Cython for cPickle?

2012-04-19 Thread Brian Curtin
On Thu, Apr 19, 2012 at 17:21, Stefan Behnel stefan...@behnel.de wrote:
 Brian Curtin, 19.04.2012 23:19:
 On Thu, Apr 19, 2012 at 16:08, Stefan Behnel
 While code generation alleviates the burden of tedious languages, it's also
 infinitely more complex, makes debugging very difficult and adds to
 prerequisite knowledge, among other drawbacks.

 You can use gdb for source level debugging of Cython code and cProfile to
 profile it. Try that with C-API code.

 I know I'm in the minority of committers being on Windows, but we do
 receive a good amount of reports and contributions from Windows users
 who dive into the C code.

 Doesn't match my experience at all - different software target audiences, I
 guess.

I don't know what this means. I work on CPython, which is the target
audience at hand, and I come across reports and contributions from
Windows users for C extensions.

 Visual Studio by itself makes debugging unbelievably easy, and with
 the Python Tools for VS plugin it even allows Visual Studio's built-in
 profiler to work. I know Windows is not on most people's maps, but if
 we have to scrap the debugger, that's another learning curve
 attachment to evaluate.

 What I meant was that there's pdb for debugging Python code (which doesn't
 know about the C code it executes) and gdb (or VS) for debugging C code,
 from which you can barely infer the Python code it executes. For Cython
 code, you can use gdb for both Cython and C, and within limits also for
 Python code. Here's a quick intro to see what I mean:

 http://docs.cython.org/src/userguide/debugging.html

I know what you meant. What I meant is that easy debugging on Windows goes
away, and now I have to set up and learn GDB on Windows. *I* can do that.
Does the rest of the community want to have to do that as well? We
should also take into consideration how something like this affects
the third-party IDEs and their debugger support.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com