Re: [Python-Dev] Opcode cache in ceval loop

2016-02-05 Thread Sven R. Kunze
On 05.02.2016 00:06, Matthias Bussonnier wrote: On Feb 4, 2016, at 08:22, Sven R. Kunze wrote: On 04.02.2016 16:57, Matthias Bussonnier wrote: On Feb 3, 2016, at 13:22, Yury Selivanov wrote: An ideal way would be to calculate a hit/miss ratio over

Re: [Python-Dev] Opcode cache in ceval loop

2016-02-04 Thread Matthias Bussonnier
> On Feb 4, 2016, at 08:22, Sven R. Kunze wrote: > > On 04.02.2016 16:57, Matthias Bussonnier wrote: >>> On Feb 3, 2016, at 13:22, Yury Selivanov wrote: >>> >>> >>> An ideal way would be to calculate a hit/miss ratio over time >>> for each cached

Re: [Python-Dev] Opcode cache in ceval loop

2016-02-04 Thread Nick Coghlan
On 3 February 2016 at 06:49, Stephen J. Turnbull wrote: > Yury Selivanov writes: > > > Not sure about that... PEPs take a LOT of time :( > > Informational PEPs need not take so much time, no more than you would > spend on ceval.txt. I'm sure a PEP would get a lot more

Re: [Python-Dev] Opcode cache in ceval loop

2016-02-04 Thread Stephen J. Turnbull
Nick Coghlan writes: > If someone else wanted to also describe the change in a PEP for ease > of future reference, using Yury's ceval.txt as input, I do think that > would be a good thing, but I wouldn't want to make the enhancement > conditional on someone volunteering to do that. I wasn't

Re: [Python-Dev] Opcode cache in ceval loop

2016-02-04 Thread Matthias Bussonnier
> On Feb 3, 2016, at 13:22, Yury Selivanov wrote: > > > An ideal way would be to calculate a hit/miss ratio over time > for each cached opcode, but that would be an expensive > calculation. Do you mean like a sliding window? Otherwise if you just want a let's say
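The "sliding window" alternative Matthias floats here can be sketched as follows. This is purely illustrative: the class name, `record`/`ratio` API, and window size are assumptions for the sketch, not anything from Yury's patch (which avoids per-opcode ratio bookkeeping precisely because it is expensive).

```python
from collections import deque

class SlidingHitRatio:
    """Track a cache's hit/miss ratio over the last `window` events."""

    def __init__(self, window=64):
        # deque with maxlen silently drops the oldest event on overflow
        self.events = deque(maxlen=window)  # True = hit, False = miss

    def record(self, hit):
        self.events.append(hit)

    def ratio(self):
        if not self.events:
            return 1.0  # optimistic default before any samples
        return sum(self.events) / len(self.events)

r = SlidingHitRatio(window=4)
for hit in (True, True, False, True, False):
    r.record(hit)
print(r.ratio())  # only the last 4 events count: 2 hits / 4 -> 0.5
```

The cost Yury alludes to is visible even in this toy: every ratio query walks the window, and every opcode would need its own deque.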

Re: [Python-Dev] Opcode cache in ceval loop

2016-02-04 Thread Sven R. Kunze
On 04.02.2016 16:57, Matthias Bussonnier wrote: On Feb 3, 2016, at 13:22, Yury Selivanov wrote: An ideal way would be to calculate a hit/miss ratio over time for each cached opcode, but that would be an expensive calculation. Do you mean like a sliding window?

Re: [Python-Dev] Opcode cache in ceval loop

2016-02-03 Thread francismb
Hi, On 02/01/2016 10:43 PM, Yury Selivanov wrote: > > We also need to deoptimize the code to avoid having too many cache > misses/pointless cache updates. I found that, for instance, LOAD_ATTR > is either super stable (hits 100% of times), or really unstable, so 20 > misses, again, seems to

Re: [Python-Dev] Opcode cache in ceval loop

2016-02-03 Thread Sven R. Kunze
On 03.02.2016 22:22, Yury Selivanov wrote: One way of tackling this is to give each optimized opcode a counter for hit/misses. When we have a "hit" we increment that counter, when it's a miss, we decrement it. Within a given range, I suppose. Like: c = min(c+1, 100) I kind of have
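The saturating counter Yury and Sven discuss here is simple enough to sketch directly. The bounds (0 and 100) come from Sven's `c = min(c+1, 100)` example; treating "counter reaches 0" as the deoptimization trigger is an assumption for illustration, not the exact threshold in the patch.

```python
# Saturating hit/miss counter: increment on a hit, decrement on a miss,
# clamped to [0, upper]. A long run of hits cannot push the counter so
# high that a later burst of misses takes forever to drain it.
def update_counter(c, hit, upper=100):
    if hit:
        return min(c + 1, upper)
    return max(c - 1, 0)

c = 100                     # start fully "confident"
for _ in range(3):          # three consecutive misses
    c = update_counter(c, hit=False)
print(c)  # 97; if c ever reached 0 we would deoptimize the opcode
```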

Re: [Python-Dev] Opcode cache in ceval loop

2016-02-03 Thread Yury Selivanov
On 2016-02-03 3:53 PM, francismb wrote: Hi, On 02/01/2016 10:43 PM, Yury Selivanov wrote: We also need to deoptimize the code to avoid having too many cache misses/pointless cache updates. I found that, for instance, LOAD_ATTR is either super stable (hits 100% of times), or really

Re: [Python-Dev] Opcode cache in ceval loop

2016-02-02 Thread Yury Selivanov
On 2016-02-02 12:41 PM, Serhiy Storchaka wrote: On 01.02.16 21:10, Yury Selivanov wrote: To measure the max/average memory impact, I tuned my code to optimize *every* code object on *first* run. Then I ran the entire Python test suite. Python test suite + standard library both contain

Re: [Python-Dev] Opcode cache in ceval loop

2016-02-02 Thread Yury Selivanov
Hi Victor, On 2016-02-02 4:33 AM, Victor Stinner wrote: Hi, Maybe it's worth writing a PEP to summarize all your changes to optimize CPython? It would avoid having to follow different threads on the mailing lists, different issues on the bug tracker, with external links to GitHub gists, etc.

Re: [Python-Dev] Opcode cache in ceval loop

2016-02-02 Thread Sven R. Kunze
On 02.02.2016 20:41, Yury Selivanov wrote: Hi Victor, On 2016-02-02 4:33 AM, Victor Stinner wrote: Hi, Maybe it's worth writing a PEP to summarize all your changes to optimize CPython? It would avoid having to follow different threads on the mailing lists, different issues on the bug

Re: [Python-Dev] Opcode cache in ceval loop

2016-02-02 Thread Yury Selivanov
On 2016-02-02 1:45 PM, Serhiy Storchaka wrote: On 02.02.16 19:45, Yury Selivanov wrote: On 2016-02-02 12:41 PM, Serhiy Storchaka wrote: On 01.02.16 21:10, Yury Selivanov wrote: To measure the max/average memory impact, I tuned my code to optimize *every* code object on *first* run. Then I

Re: [Python-Dev] Opcode cache in ceval loop

2016-02-02 Thread Victor Stinner
2016-02-02 20:23 GMT+01:00 Yury Selivanov: > Alright, I modified the code to optimize ALL code objects, and ran unit > tests with the above tests excluded: > > -- Max process mem (ru_maxrss) = 131858432 > -- Opcode cache number of objects = 42109 > -- Opcode cache

Re: [Python-Dev] Opcode cache in ceval loop

2016-02-02 Thread Serhiy Storchaka
On 02.02.16 21:23, Yury Selivanov wrote: Alright, I modified the code to optimize ALL code objects, and ran unit tests with the above tests excluded: -- Max process mem (ru_maxrss) = 131858432 -- Opcode cache number of objects = 42109 -- Opcode cache total extra mem= 10901106 Thank

Re: [Python-Dev] Opcode cache in ceval loop

2016-02-02 Thread ƦOB COASTN
>> I can write a ceval.txt file explaining what's going on >> in ceval loop, with details on the opcode cache and other >> things. I think it's even better than a PEP, to be honest. > > > I totally agree. > Please include the notes text file. This provides an excellent summary for those of us

Re: [Python-Dev] Opcode cache in ceval loop

2016-02-02 Thread Stephen J. Turnbull
Yury Selivanov writes: > Not sure about that... PEPs take a LOT of time :( Informational PEPs need not take so much time, no more than you would spend on ceval.txt. I'm sure a PEP would get a lot more attention from reviewers, too. Even if you PEP the whole thing, as you say it's a (big ;-)

Re: [Python-Dev] Opcode cache in ceval loop

2016-02-02 Thread Serhiy Storchaka
On 02.02.16 21:41, Yury Selivanov wrote: I can write a ceval.txt file explaining what's going on in ceval loop, with details on the opcode cache and other things. I think it's even better than a PEP, to be honest. I totally agree.

Re: [Python-Dev] Opcode cache in ceval loop

2016-02-02 Thread Serhiy Storchaka
On 02.02.16 19:45, Yury Selivanov wrote: On 2016-02-02 12:41 PM, Serhiy Storchaka wrote: On 01.02.16 21:10, Yury Selivanov wrote: To measure the max/average memory impact, I tuned my code to optimize *every* code object on *first* run. Then I ran the entire Python test suite. Python test

Re: [Python-Dev] Opcode cache in ceval loop

2016-02-02 Thread Victor Stinner
Hi, Maybe it's worth writing a PEP to summarize all your changes to optimize CPython? It would avoid having to follow different threads on the mailing lists, different issues on the bug tracker, with external links to GitHub gists, etc. Your code changes critical parts of Python: code object

Re: [Python-Dev] Opcode cache in ceval loop

2016-02-02 Thread Serhiy Storchaka
On 01.02.16 21:10, Yury Selivanov wrote: To measure the max/average memory impact, I tuned my code to optimize *every* code object on *first* run. Then I ran the entire Python test suite. Python test suite + standard library both contain around 72395 code objects, which required 20Mb of memory

[Python-Dev] Opcode cache in ceval loop

2016-02-01 Thread Yury Selivanov
Hi, This is the second email thread I start regarding implementing an opcode cache in ceval loop. Since my first post on this topic: - I've implemented another optimization (LOAD_ATTR); - I've added detailed statistics mode so that I can "see" how the cache performs and tune it; - some

Re: [Python-Dev] Opcode cache in ceval loop

2016-02-01 Thread Sven R. Kunze
On 01.02.2016 20:51, Yury Selivanov wrote: If LOAD_ATTR gets too many cache misses (20 in my current patch) it gets deoptimized, and the default implementation is used. So if the code is very dynamic - there's no improvement, but no performance penalty either. Will you re-try optimizing it?

Re: [Python-Dev] Opcode cache in ceval loop

2016-02-01 Thread Yury Selivanov
Hi Damien, On 2016-02-01 3:59 PM, Damien George wrote: Hi Yury, That's great news about the speed improvements with the dict offset cache! The cache struct is defined in code.h [2], and is 32 bytes long. When a code object becomes hot, it gets a cache offset table allocated for it (+1 byte

Re: [Python-Dev] Opcode cache in ceval loop

2016-02-01 Thread Sven R. Kunze
On 01.02.2016 21:35, Yury Selivanov wrote: It's important to understand that if we have a lot of cache misses after the code object was executed 1000 times, it doesn't make sense to keep trying to update that cache. It just means that the code, in that particular point, works with different

Re: [Python-Dev] Opcode cache in ceval loop

2016-02-01 Thread Andrew Barnert via Python-Dev
Looking over the thread and the two issues, you've got good arguments for why the improved code will be the most common code, and good benchmarks for various kinds of real-life code, but it doesn't seem like you'd tried to stress it on anything that could be made worse. From your explanations

Re: [Python-Dev] Opcode cache in ceval loop

2016-02-01 Thread Brett Cannon
On Mon, 1 Feb 2016 at 12:16 Yury Selivanov wrote: > Brett, > > On 2016-02-01 3:08 PM, Brett Cannon wrote: > > On Mon, 1 Feb 2016 at 11:51 Yury Selivanov wrote: > > Hi Brett, > [..]

Re: [Python-Dev] Opcode cache in ceval loop

2016-02-01 Thread Brett Cannon
On Mon, 1 Feb 2016 at 11:11 Yury Selivanov wrote: > Hi, > > This is the second email thread I start regarding implementing an opcode > cache in ceval loop. Since my first post on this topic: > > - I've implemented another optimization (LOAD_ATTR); > > - I've added

Re: [Python-Dev] Opcode cache in ceval loop

2016-02-01 Thread Yury Selivanov
On 2016-02-01 4:02 PM, Sven R. Kunze wrote: On 01.02.2016 21:35, Yury Selivanov wrote: It's important to understand that if we have a lot of cache misses after the code object was executed 1000 times, it doesn't make sense to keep trying to update that cache. It just means that the code, in

Re: [Python-Dev] Opcode cache in ceval loop

2016-02-01 Thread Damien George
Hi Yury, That's great news about the speed improvements with the dict offset cache! > The cache struct is defined in code.h [2], and is 32 bytes long. When a > code object becomes hot, it gets a cache offset table allocated for it > (+1 byte for each opcode) + an array of cache structs. Ok, so
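The "offset table" Damien asks about here, one byte per opcode position mapping into an array of cache structs, can be sketched like this. The layout and the 1-based-slot convention are assumptions made for the sketch; the real patch lives in C in code.h/ceval.c and its exact encoding may differ.

```python
# One byte per instruction offset: 0 means "this opcode has no cache
# slot"; a nonzero value is a 1-based index into the cache array, so
# the interpreter can find an opcode's cache in O(1) without counting
# opcodes (which vary between 1 and 3 bytes in 2016-era CPython).
def build_offset_table(code_len, cached_positions):
    table = bytearray(code_len)
    for slot, pos in enumerate(cached_positions, start=1):
        table[pos] = slot
    return table

def cache_for(table, caches, instr_offset):
    slot = table[instr_offset]
    return caches[slot - 1] if slot else None

table = build_offset_table(10, cached_positions=[2, 7])
caches = [{"type": None, "value": None} for _ in range(2)]

print(cache_for(table, caches, 7) is caches[1])  # True
print(cache_for(table, caches, 3))               # None: not a cached opcode
```

One byte per slot index also explains the "+1 byte for each opcode" memory figure quoted above, with the caveat that a single byte caps the number of cached opcodes per code object at 255.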

Re: [Python-Dev] Opcode cache in ceval loop

2016-02-01 Thread Yury Selivanov
Sven, On 2016-02-01 4:32 PM, Sven R. Kunze wrote: On 01.02.2016 22:27, Yury Selivanov wrote: Right now they are private constants in ceval.c. I will (maybe) expose a private API via the _testcapi module to re-define them (set them to 1 or 0), only to write better unittests. I have no plans

Re: [Python-Dev] Opcode cache in ceval loop

2016-02-01 Thread Yury Selivanov
On 2016-02-01 4:21 PM, Yury Selivanov wrote: Hi Damien, On 2016-02-01 3:59 PM, Damien George wrote: [..] But then how do you index the cache, do you keep a count of the current opcode number? If I remember correctly, CPython has some opcodes taking 1 byte, and some taking 3 bytes, so the

Re: [Python-Dev] Opcode cache in ceval loop

2016-02-01 Thread Sven R. Kunze
On 01.02.2016 22:27, Yury Selivanov wrote: Right now they are private constants in ceval.c. I will (maybe) expose a private API via the _testcapi module to re-define them (set them to 1 or 0), only to write better unittests. I have no plans to make those constants public or have a public API

Re: [Python-Dev] Opcode cache in ceval loop

2016-02-01 Thread Brett Cannon
On Mon, 1 Feb 2016 at 11:51 Yury Selivanov wrote: > Hi Brett, > > On 2016-02-01 2:30 PM, Brett Cannon wrote: > > On Mon, 1 Feb 2016 at 11:11 Yury Selivanov wrote: > > Hi, > [..]

Re: [Python-Dev] Opcode cache in ceval loop

2016-02-01 Thread Yury Selivanov
Brett, On 2016-02-01 3:08 PM, Brett Cannon wrote: On Mon, 1 Feb 2016 at 11:51 Yury Selivanov wrote: Hi Brett, [..] The first two fields are used to make sure that we have objects of the same type. If it changes,
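The type guard Yury describes here ("the first two fields are used to make sure that we have objects of the same type") can be sketched as follows. This is a deliberately simplified stand-in: the field names are invented, the real patch also checks a type version tag, and it caches a dict offset rather than re-reading `__dict__` by name.

```python
class AttrCacheEntry:
    """Per-opcode LOAD_ATTR cache sketch: reuse the cached lookup only
    while the receiver's type matches the one seen at fill time."""

    def __init__(self):
        self.tp = None      # type observed when the entry was filled
        self.misses = 0

    def load(self, obj, name):
        if type(obj) is self.tp:
            # hit: trust the cached shape, read straight from the
            # instance dict (the real cache stores the dict offset)
            return obj.__dict__[name]
        self.misses += 1    # miss: fall back to a full lookup and refill
        self.tp = type(obj)
        return getattr(obj, name)

class Point:
    def __init__(self, x):
        self.x = x

entry = AttrCacheEntry()
entry.load(Point(1), "x")   # first call: miss, entry filled
entry.load(Point(2), "x")   # same type again: hit
print(entry.misses)  # 1
```

Too many misses in a row is exactly the signal the thread's counters watch for: it means this call site sees objects of changing type, so the entry should be deoptimized rather than endlessly refilled.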

Re: [Python-Dev] Opcode cache in ceval loop

2016-02-01 Thread Yury Selivanov
Hi Brett, On 2016-02-01 2:30 PM, Brett Cannon wrote: On Mon, 1 Feb 2016 at 11:11 Yury Selivanov wrote: Hi, [..] What's next? First, I'd like to merge the new LOAD_METHOD opcode, see issue 26110

Re: [Python-Dev] Opcode cache in ceval loop

2016-02-01 Thread Yury Selivanov
On 2016-02-01 3:21 PM, Brett Cannon wrote: On Mon, 1 Feb 2016 at 12:16 Yury Selivanov wrote: Brett, On 2016-02-01 3:08 PM, Brett Cannon wrote: > On Mon, 1 Feb 2016 at 11:51 Yury Selivanov

Re: [Python-Dev] Opcode cache in ceval loop

2016-02-01 Thread Yury Selivanov
On 2016-02-01 3:27 PM, Sven R. Kunze wrote: On 01.02.2016 20:51, Yury Selivanov wrote: If LOAD_ATTR gets too many cache misses (20 in my current patch) it gets deoptimized, and the default implementation is used. So if the code is very dynamic - there's no improvement, but no performance

Re: [Python-Dev] Opcode cache in ceval loop

2016-02-01 Thread Yury Selivanov
Andrew, On 2016-02-01 4:29 PM, Andrew Barnert wrote: Looking over the thread and the two issues, you've got good arguments for why the improved code will be the most common code, and good benchmarks for various kinds of real-life code, but it doesn't seem like you'd tried to stress it on