Re: [Python-Dev] [ANN] VPython 0.1

2008-11-30 Thread Jeffrey Yasskin
Here's another data point. My results are similar to Skip's (unsurprising since I'm also using a mac). My wild guess is that the 30% vs 10% improvement is an AMD vs. Intel thing? It's not 32-bit vs. 64-bit since both David and Jakob got a 30% speedup, but David had a 32-bit build while Jakob had a

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-26 Thread Stefan Behnel
Greg Ewing wrote: A.M. Kuchling wrote: A stray thought: does using a generator for the VM make life easier for the Stackless Python developers in any way? Does it make it possible for stock CPython to become stackless? I doubt it. A major barrier to stacklessness is that a lot of

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-26 Thread Jakob Sievers
Phillip J. Eby [EMAIL PROTECTED] writes: At 10:47 AM 10/24/2008 +0200, J. Sievers wrote: - Right now, CPython's bytecode is translated to direct threaded code lazily (when a code object is first evaluated). This would have to be merged into compile.c in some way plus some assorted minor

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-26 Thread Greg Ewing
Stefan Behnel wrote: That's obviously a problem, but it only answers the second question, not the first one. [does using a generator for the VM make life easier for the Stackless Python developers in any way?] The Stackless Python developers themselves would have to answer that one, but my

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-25 Thread A.M. Kuchling
On Sat, Oct 25, 2008 at 04:33:23PM +1300, Greg Ewing wrote: Maybe not, but at least you can follow what it's doing just by knowing C. Introducing vmgen would introduce another layer for the reader to learn about. A stray thought: does using a generator for the VM make life easier for the

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-25 Thread Phillip J. Eby
At 07:50 AM 10/25/2008 -0400, A.M. Kuchling wrote: On Sat, Oct 25, 2008 at 04:33:23PM +1300, Greg Ewing wrote: Maybe not, but at least you can follow what it's doing just by knowing C. Introducing vmgen would introduce another layer for the reader to learn about. A stray thought: does using

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-25 Thread Greg Ewing
A.M. Kuchling wrote: A stray thought: does using a generator for the VM make life easier for the Stackless Python developers in any way? Does it make it possible for stock CPython to become stackless? I doubt it. A major barrier to stacklessness is that a lot of extension modules would need

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-24 Thread J. Sievers
M.-A. Lemburg [EMAIL PROTECTED] writes: [snip] BTW: I hope you did not use pybench to get profiles of the opcodes. That would most certainly result in good results for pybench, but less good ones for general applications such as Django or Zope/Plone. Algorithm used for superinstruction

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-24 Thread J. Sievers
Daniel Stutzbach [EMAIL PROTECTED] writes: [snip] I searched around for information on how threaded code interacts with branch prediction, and here's what I found. The short answer is that threaded code significantly improves branch prediction. See ``Optimizing indirect branch

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-24 Thread Ralf Schmitt
On Fri, Oct 24, 2008 at 7:18 AM, Terry Reedy [EMAIL PROTECTED] wrote: I have not seen any Windows test yet. The direct threading is gcc-specific, so there might be degradation with MSVC. erlang uses gcc to compile a single source file on windows and uses MS VC++ to compile all others. They

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-24 Thread J. Sievers
Greg Ewing [EMAIL PROTECTED] writes: Daniel Stutzbach wrote: With threaded code, every handler ends with its own dispatcher, so the processor can make fine-grained predictions. I'm still wondering whether all this stuff makes a noticeable difference in real-life Python code, which spends

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-24 Thread M.-A. Lemburg
On 2008-10-24 09:53, J. Sievers wrote: M.-A. Lemburg [EMAIL PROTECTED] writes: [snip] BTW: I hope you did not use pybench to get profiles of the opcodes. That would most certainly result in good results for pybench, but less good ones for general applications such as Django or Zope/Plone.
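
The concern in this exchange is that an opcode profile depends on the workload it was taken from. As a rough illustration only, the sketch below counts adjacent opcode pairs in a single function using the standard `dis` module. It counts static pairs rather than executed ones, `opcode_pairs` and `sample` are hypothetical names, and nothing here reflects how VPython actually chose its superinstructions.

```python
import dis
from collections import Counter


def opcode_pairs(func):
    """Count adjacent opcode-name pairs in one function's bytecode (static, not executed)."""
    names = [ins.opname for ins in dis.get_instructions(func)]
    # zip(names, names[1:]) yields each instruction together with its successor.
    return Counter(zip(names, names[1:]))


def sample(a, b):
    # Hypothetical workload; a different function would give a different profile.
    total = a + b
    return total * 2


pairs = opcode_pairs(sample)
for (first, second), count in pairs.most_common(3):
    print(first, second, count)
```

Running the same counter over a benchmark suite and over an application would presumably give different rankings, which is the point being raised about pybench.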

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-24 Thread J. Sievers
[EMAIL PROTECTED] writes: On 23 Oct, 10:42 pm, [EMAIL PROTECTED] wrote: Guido van Rossum wrote: there already is something else called VPython Perhaps it could be called Fython (Python with a Forth-like VM) or Thython (threaded-code Python). I feel like I've missed something important, but,

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-24 Thread Stefan Behnel
Greg Ewing wrote: [EMAIL PROTECTED] wrote: Is there any reason this should be a separate project rather than just be rolled in to the core? Always keep in mind that one of the important characteristics of CPython is that its implementation is very straightforward and easy to follow.

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-24 Thread skip
Guido wrote: This is very interesting (at this point I'm just lurking), but has anyone pointed out yet that there already is something else called VPython, which has a long standing right to the name? I believe Jakob has already been notified about this. How about TPython? A

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-24 Thread skip
Terry wrote: I have not seen any Windows test yet. The direct threading is gcc-specific, so there might be degradation with MSVC. Not if a compiler #ifdef selects between two independent choices: #ifdef __GCC__ /* or whatever the right incantation is */ #include

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-24 Thread Phillip J. Eby
At 10:47 AM 10/24/2008 +0200, J. Sievers wrote: - Right now, CPython's bytecode is translated to direct threaded code lazily (when a code object is first evaluated). This would have to be merged into compile.c in some way plus some assorted minor changes. Don't you mean codeobject.c? I

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-24 Thread Jakob Sievers
[EMAIL PROTECTED] writes: BTW, as to the implementation of individual VM instructions I don't believe the Vmgen stuff affects that. It's just the way the instructions are assembled. Vmgen handles the pushing and popping as well. E.g. ROT_THREE becomes: rot_three ( a1 a2 a3 -- a3 a1 a2 )
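
The stack-effect notation quoted above, ( a1 a2 a3 -- a3 a1 a2 ), follows the Forth convention that the rightmost name is the top of the stack. As a hedged Python sketch of what "handles the pushing and popping" could amount to, the hypothetical helper `make_instruction` below builds an instruction from such a declaration. It is an illustration of the idea, not Vmgen's actual output.

```python
def make_instruction(effect):
    """Build a stack operation from a Forth-style effect such as 'a b -- b a'."""
    inputs, outputs = (side.split() for side in effect.split("--"))

    def instruction(stack):
        values = {}
        # The rightmost input is the top of the stack, so pop in reverse order.
        for name in reversed(inputs):
            values[name] = stack.pop()
        # Push outputs left to right, so the rightmost output ends up on top.
        stack.extend(values[name] for name in outputs)

    return instruction


rot_three = make_instruction("a1 a2 a3 -- a3 a1 a2")

stack = ["bottom", 1, 2, 3]
rot_three(stack)
print(stack)  # ['bottom', 3, 1, 2]
```

The former top of stack (3) ends up in third position, and the two items below it each move up by one.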

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-24 Thread Jakob Sievers
[EMAIL PROTECTED] writes: Guido wrote: This is very interesting (at this point I'm just lurking), but has anyone pointed out yet that there already is something else called VPython, which has a long standing right to the name? I believe Jakob has already been notified about

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-24 Thread Greg Ewing
Stefan Behnel wrote: Funny to hear that from the author of a well-known code generator. ;-) I've never claimed that anything about the implementation of Pyrex is easy to follow. :-) Having two switch statements and a couple of separate special cases for a single eval loop might look pretty

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-23 Thread J. Sievers
Hey, I hope you don't mind my replying in digest form. First off, I guess I should be a little clearer as to what VPython is and what it does. VPython is essentially a set of patches for CPython (it touches only three files, diff -b is about 800 lines IIRC plus the switch statement in ceval.c's

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-23 Thread Adam Olsen
On Thu, Oct 23, 2008 at 1:08 AM, J. Sievers [EMAIL PROTECTED] wrote: In particular, direct threaded code leads to less horrible branch prediction than switch dispatch on many machines (exactly how pronounced this effect is depends heavily on the specific architecture). To clarify: This is

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-23 Thread M.-A. Lemburg
On 2008-10-23 09:08, J. Sievers wrote: a) It's fairly easy to implement different types of dispatch, simply by changing a few macros (and while I haven't done this, it shouldn't be a problem to add some switch dispatch #ifdefs for non-GCC platforms). In particular, direct threaded code leads

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-23 Thread Greg Ewing
Adam Olsen wrote: To clarify: This is *NOT* actually a form of threading, is it? I think the term threaded code is being used here in the sense of Forth, i.e. instead of a sequence of small integers that are dispatched using a switch statement, you use the actual machine addresses of the
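
To make the distinction concrete, here is a toy Python sketch of the two representations: a program of small integer opcodes decoded by one central if/elif chain, and the same program translated once so that each opcode slot holds a reference to its handler. All names are hypothetical. Python has no computed goto, so a small driver loop remains and each handler does not jump straight to the next one as it would in gcc-built direct threaded code; the sketch only shows the change in representation.

```python
# Hypothetical four-opcode stack machine, for illustration only.
PUSH, ADD, MUL, HALT = range(4)


def run_switch(code):
    """Switch-style dispatch: every opcode goes through one central decoder."""
    stack, pc = [], 0
    while True:
        op = code[pc]
        if op == PUSH:
            stack.append(code[pc + 1])
            pc += 2
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
            pc += 1
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
            pc += 1
        elif op == HALT:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op}")


def op_push(stack, code, pc):
    stack.append(code[pc + 1])
    return pc + 2


def op_add(stack, code, pc):
    b, a = stack.pop(), stack.pop()
    stack.append(a + b)
    return pc + 1


def op_mul(stack, code, pc):
    b, a = stack.pop(), stack.pop()
    stack.append(a * b)
    return pc + 1


def op_halt(stack, code, pc):
    return -1  # sentinel: stop the driver loop


HANDLERS = {PUSH: op_push, ADD: op_add, MUL: op_mul, HALT: op_halt}
OPERANDS = {PUSH: 1, ADD: 0, MUL: 0, HALT: 0}


def thread(code):
    """Translate once: replace each integer opcode with its handler."""
    threaded, pc = [], 0
    while pc < len(code):
        op = code[pc]
        threaded.append(HANDLERS[op])
        count = OPERANDS[op]
        threaded.extend(code[pc + 1 : pc + 1 + count])  # copy operands unchanged
        pc += 1 + count
    return threaded


def run_threaded(threaded):
    """No decoding step: the slot at pc already is the code to run."""
    stack, pc = [], 0
    while pc >= 0:
        pc = threaded[pc](stack, threaded, pc)
    return stack.pop()


program = [PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT]  # (2 + 3) * 4
print(run_switch(program), run_threaded(thread(program)))
```

Both runs compute the same result; only the second skips the opcode lookup at run time, because the translation step did it ahead of time.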

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-23 Thread A.M. Kuchling
On Thu, Oct 23, 2008 at 01:31:48AM -0600, Adam Olsen wrote: To clarify: This is *NOT* actually a form of threading, is it? It merely breaks the giant dispatch table into a series of small ones, while also grouping instructions into larger superinstructions? OS threads are not touched at any

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-23 Thread Antoine Pitrou
A.M. Kuchling amk at amk.ca writes: threaded code: A technique for implementing virtual machine interpreters, introduced by J.R. Bell in 1973, where each op-code in the virtual machine instruction set is the address of some (lower level) code to perform the required

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-23 Thread David Ripton
On 2008.10.23 12:02:12 +0200, M.-A. Lemburg wrote: BTW: I hope you did not use pybench to get profiles of the opcodes. That would most certainly result in good results for pybench, but less good ones for general applications such as Django or Zope/Plone. I was wondering about Pybench-specific

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-23 Thread M.-A. Lemburg
On 2008-10-23 15:19, David Ripton wrote: On 2008.10.23 12:02:12 +0200, M.-A. Lemburg wrote: BTW: I hope you did not use pybench to get profiles of the opcodes. That would most certainly result in good results for pybench, but less good ones for general applications such as Django or

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-23 Thread skip
Jakob wrote: David Gregg (and friends) recently published a paper comparing stack based and register based VMs for Java and found that register based VMs were substantially faster. The main reason for this appears to be the absence of the various LOAD_ instructions

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-23 Thread Daniel Stutzbach
On Thu, Oct 23, 2008 at 8:13 AM, Antoine Pitrou [EMAIL PROTECTED] wrote: Is this kind of optimization that useful on modern CPUs? It helps remove a memory access to the switch/case lookup table, which should shave off the 3 CPU cycles of latency of a modern L1 data cache, but it won't remove

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-23 Thread Greg Ewing
Daniel Stutzbach wrote: With threaded code, every handler ends with its own dispatcher, so the processor can make fine-grained predictions. I'm still wondering whether all this stuff makes a noticeable difference in real-life Python code, which spends most of its time doing expensive things

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-23 Thread Guido van Rossum
On Wed, Oct 22, 2008 at 5:16 AM, J. Sievers [EMAIL PROTECTED] wrote: I implemented a variant of the CPython VM on top of Gforth's Vmgen; this made it fairly straightforward to add direct threaded code and superinstructions for the various permutations of LOAD_CONST, LOAD_FAST, and most of the

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-23 Thread Greg Ewing
Guido van Rossum wrote: there already is something else called VPython Perhaps it could be called Fython (Python with a Forth-like VM) or Thython (threaded-code Python). -- Greg ___ Python-Dev mailing list Python-Dev@python.org

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-23 Thread glyph
On 23 Oct, 10:42 pm, [EMAIL PROTECTED] wrote: Guido van Rossum wrote: there already is something else called VPython Perhaps it could be called Fython (Python with a Forth-like VM) or Thython (threaded-code Python). I feel like I've missed something important, but, why not just call it

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-23 Thread Greg Ewing
[EMAIL PROTECTED] wrote: Is there any reason this should be a separate project rather than just be rolled in to the core? Always keep in mind that one of the important characteristics of CPython is that its implementation is very straightforward and easy to follow. Replacing the ceval loop

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-23 Thread Terry Reedy
[EMAIL PROTECTED] wrote: It's a substantial patch, but from what I understand it's a huge performance improvement and completely compatible, both at the C API and Python source levels. I have not seen any Windows test yet. The direct threading is gcc-specific, so there might be degradation

[Python-Dev] [ANN] VPython 0.1

2008-10-22 Thread J. Sievers
Hi, I implemented a variant of the CPython VM on top of Gforth's Vmgen; this made it fairly straightforward to add direct threaded code and superinstructions for the various permutations of LOAD_CONST, LOAD_FAST, and most of the two-argument VM instructions. Sources:
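
For readers unfamiliar with the term, a superinstruction replaces a common sequence of VM instructions with a single combined one, so the interpreter dispatches less often. The sketch below fuses adjacent LOAD_FAST / LOAD_CONST pairs in a toy instruction list. The tuple format, the fused opcode names and the `fuse` helper are all assumptions made for illustration and are not taken from the VPython sources.

```python
# Hypothetical toy format: each instruction is an (opname, argument) tuple.
FUSABLE = {
    ("LOAD_FAST", "LOAD_FAST"): "LOAD_FAST_LOAD_FAST",
    ("LOAD_FAST", "LOAD_CONST"): "LOAD_FAST_LOAD_CONST",
    ("LOAD_CONST", "LOAD_FAST"): "LOAD_CONST_LOAD_FAST",
}


def fuse(instructions):
    """Greedily merge adjacent fusable pairs into one superinstruction.

    A real implementation would also have to leave jump targets alone;
    this toy version ignores control flow entirely.
    """
    fused, i = [], 0
    while i < len(instructions):
        if i + 1 < len(instructions):
            (op1, arg1), (op2, arg2) = instructions[i], instructions[i + 1]
            name = FUSABLE.get((op1, op2))
            if name is not None:
                # One instruction now carries both arguments.
                fused.append((name, (arg1, arg2)))
                i += 2
                continue
        fused.append(instructions[i])
        i += 1
    return fused


code = [
    ("LOAD_FAST", "x"),
    ("LOAD_CONST", 1),
    ("BINARY_ADD", None),
    ("RETURN_VALUE", None),
]
print(fuse(code))
```

In this example four dispatches become three, since the two loads are executed as one instruction.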

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-22 Thread Paul Moore
2008/10/22 J. Sievers [EMAIL PROTECTED]: I implemented a variant of the CPython VM on top of Gforth's Vmgen; this made it fairly straightforward to add direct threaded code and superinstructions for the various permutations of LOAD_CONST, LOAD_FAST, and most of the two-argument VM

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-22 Thread M.-A. Lemburg
On 2008-10-22 14:16, J. Sievers wrote: Hi, I implemented a variant of the CPython VM on top of Gforth's Vmgen; this made it fairly straightforward to add direct threaded code and superinstructions for the various permutations of LOAD_CONST, LOAD_FAST, and most of the two-argument VM

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-22 Thread Leonardo Santagada
On Oct 22, 2008, at 10:16 AM, J. Sievers wrote: Hi, I implemented a variant of the CPython VM on top of Gforth's Vmgen; this made it fairly straightforward to add direct threaded code and superinstructions for the various permutations of LOAD_CONST, LOAD_FAST, and most of the

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-22 Thread skip
J. Sievers wrote: I implemented a variant of the CPython VM on top of Gforth's Vmgen; this made it fairly straightforward to add direct threaded code and superinstructions for the various permutations of LOAD_CONST, LOAD_FAST, and most of the two-argument VM instructions. Sources:

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-22 Thread skip
J. Sievers wrote: I implemented a variant of the CPython VM on top of Gforth's Vmgen; this made it fairly straightforward to add direct threaded code and superinstructions for the various permutations of LOAD_CONST, LOAD_FAST, and most of the two-argument VM instructions. Skip wrote: Trying to

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-22 Thread David Ripton
Feedback is, of course, very welcome and it'd be great to have some pybench results from different machines. My results are very similar to Jakob's. Gentoo Linux, 32-bit x86, Athlon 6400+ underclocked to 3.0 GHz. make test: 282 tests OK. 5 tests failed: test_doctest test_hotshot

Re: [Python-Dev] [ANN] VPython 0.1

2008-10-22 Thread Terry Reedy
David Ripton wrote: Feedback is, of course, very welcome and it'd be great to have some pybench results from different machines. My results are very similar to Jakob's. From looking thru the vmgen manual, there are two things it is doing that CPython is not. 1. gcc-specific threaded code;