Re: [Python-Dev] Possible performance regression

2019-02-26 Thread Stephen J. Turnbull
Raymond Hettinger writes:

 > We're trying to keep performant the ones that people actually use.
 > For the Mac, I think there are only four that matter:
 > 
 > 1) The one we distribute on the python.org website at
 > https://www.python.org/ftp/python/3.8.0/python-3.8.0a2-macosx10.9.pkg
 > 
 > 2) The one installed by homebrew
 > 
 > 3) The way folks typically roll their own:
 >  $ ./configure && make   (or some variant of make install)
 > 
 > 4) The one shipped by Apple and put in /usr/bin

I don't see the relevance of (4) since we're talking about the
bleeding edge AFAICT.  Not clear about Homebrew -- since I've been
experimenting with it recently I use the bottled versions, which
aren't bleeding edge.

If prebuilt packages matter, I would add MacPorts (or substitute it
for (4) since nothing seems to get Apple's attention) and Anaconda
(which is what I recommend to my students).  But I haven't looked at
MacPorts' recent download stats, and maybe I'm just the odd one out.

Steve


-- 
Associate Professor  Division of Policy and Planning Science
http://turnbull.sk.tsukuba.ac.jp/ Faculty of Systems and Information
Email: turnb...@sk.tsukuba.ac.jp   University of Tsukuba
Tel: 029-853-5175 Tennodai 1-1-1, Tsukuba 305-8573 JAPAN


Re: [Python-Dev] Possible performance regression

2019-02-26 Thread Raymond Hettinger
On Feb 26, 2019, at 2:28 PM, Neil Schemenauer  wrote:
> 
> Are you compiling with --enable-optimizations (i.e. PGO)?  In my
> experience, that is needed to get meaningful results.

I'm not, and I would worry that PGO would give less stable comparisons because 
it is highly sensitive to changes in its training set as well as to the actual 
CPython implementation (two moving targets instead of one).  That said, it 
doesn't really matter to the world how I build *my* Python.  We're trying to 
keep performant the ones that people actually use.  For the Mac, I think there 
are only four that matter:

1) The one we distribute on the python.org website at
https://www.python.org/ftp/python/3.8.0/python-3.8.0a2-macosx10.9.pkg

2) The one installed by homebrew

3) The way folks typically roll their own:
$ ./configure && make   (or some variant of make install)

4) The one shipped by Apple and put in /usr/bin

Of the four, the ones I've been timing are #1 and #3.

I'm happy to drop this.  I was looking for independent confirmation and didn't 
get it.  We can't move forward unless someone else also observes a consistently 
measurable regression for a benchmark they care about on a build that they care 
about.  If I'm the only one who notices, then it really doesn't matter.  Also, it 
was reassuring to not see the same effect on a GCC-8 build.

Since the effect seems to be compiler specific, it may be that we knocked it 
out of a local minimum and that performance will return the next time someone 
touches the eval-loop.


Raymond  










Re: [Python-Dev] Possible performance regression

2019-02-26 Thread Victor Stinner
On Wed, Feb 27, 2019 at 00:17, Victor Stinner  wrote:
> My sad story with code placement:
> https://vstinner.github.io/analysis-python-performance-issue.html
>
> tl; dr Use PGO.

Hum wait, this article isn't complete. You have to see the follow-up:
https://bugs.python.org/issue28618#msg286662

"""
Victor: "FYI I wrote an article about this issue:
https://haypo.github.io/analysis-python-performance-issue.html Sadly,
it seems like I was just lucky when adding __attribute__((hot)) fixed
the issue, because call_method is slow again!"

I upgraded the speed-python server (which runs the benchmarks) to Ubuntu 16.04
LTS to support PGO compilation. I removed all old benchmark results
and ran the benchmarks again with LTO+PGO. It seems like benchmark results
are much better now.

I'm not sure anymore that _Py_HOT_FUNCTION is really useful for getting
stable benchmarks, but it may help code placement a little bit. I
don't think that it hurts, so I suggest keeping it. Since benchmarks
were still unstable with _Py_HOT_FUNCTION, I'm not interested in
continuing to tag more functions with _Py_HOT_FUNCTION. I will now focus
on LTO+PGO for stable benchmarks, and ignore small performance
differences when PGO is not used.

I close this issue now.
"""

Now I recall that I tried hard to avoid PGO: the server used by
speed.python.org to run benchmarks didn't support PGO.

I fixed the issue by upgrading Ubuntu :-) Now speed.python.org uses
PGO, and I stopped manually helping the compiler with code
placement.

Victor


Re: [Python-Dev] Possible performance regression

2019-02-26 Thread Victor Stinner
Hi,

PGO compilation is very slow. I tried very hard to avoid it.

I started to annotate the C code with various GCC attributes like
"inline", "always_inline", "hot", etc. I also experimented with the
likely/unlikely Linux-style macros which use __builtin_expect(). In the
end... my efforts were worthless. I still had a *major* issue (a benchmark
*suddenly* 68% slower! WTF?) with code locality and I decided to give
up. You can still find some macros like _Py_HOT_FUNCTION and
_Py_NO_INLINE in Python ;-) (_Py_NO_INLINE is used to reduce stack
memory usage, but that's a different story.)

My sad story with code placement:
https://vstinner.github.io/analysis-python-performance-issue.html

tl; dr Use PGO.

--

Since that time, I removed call_method from pyperformance to fix the
root issue: don't waste your time on micro-benchmarks ;-) ... But I
kept these micro-benchmarks in a different project:
https://github.com/vstinner/pymicrobench

For some specific needs (taking a decision on a specific optimization),
micro-benchmarks are sometimes still useful ;-)
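
For example, a minimal micro-benchmark written with pyperf (the current
name of my perf module) looks roughly like this sketch -- the timed
statement here is arbitrary:

    # bench_read_dict.py -- minimal pyperf micro-benchmark sketch.
    import pyperf

    runner = pyperf.Runner()
    # Runner spawns worker processes and reports mean +- std dev.
    runner.timeit("read_dict",
                  stmt="d[1]",
                  setup="d = {1: None}")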

Victor

On Tue, Feb 26, 2019 at 23:31, Neil Schemenauer  wrote:
>
> On 2019-02-26, Raymond Hettinger wrote:
> > That said, I'm only observing the effect when building with the
> > Mac default Clang (Apple LLVM version 10.0.0, clang-1000.11.45.5).
> > When building with GCC 8.3.0, there is no change in performance.
>
> My guess is that the code in _PyEval_EvalFrameDefault() got changed
> enough that Clang started emitting slightly different machine code.  If
> the conditional jumps are a bit different, I understand that could
> make a significant difference to performance.
>
> Are you compiling with --enable-optimizations (i.e. PGO)?  In my
> experience, that is needed to get meaningful results.  Victor also
> mentions that on his "how-to-get-stable-benchmarks" page.  Building
> with PGO is really (really) slow so I suspect you are not doing it
> when bisecting.  You can speed it up greatly by using a simpler
> command for PROFILE_TASK in Makefile.pre.in.  E.g.
>
> PROFILE_TASK=$(srcdir)/my_benchmark.py
>
> Now that you have narrowed it down to a single commit, it would be
> worth doing the comparison with PGO builds (assuming Clang supports
> that).
>
> > That said, it seems to be compiler specific and only affects the
> > Mac builds, so maybe we can decide that we don't care.
>
> I think the key question is if the ceval loop got a bit slower due
> to logic changes or if Clang just happened to generate a bit worse
> code due to source code details.  A PGO build could help answer
> that.  I suppose trying to compare machine code is going to produce
> too large of a diff.
>
> Could you try hoisting the eval_breaker expression, as suggested by
> Antoine:
>
> https://discuss.python.org/t/profiling-cpython-with-perf/940/2
>
> If you think a slowdown affects most opcodes, I think the DISPATCH
> change looks like the only cause.  Maybe I missed something though.
>
> Also, maybe there would be some value in marking key branches as
> likely/unlikely if it helps Clang generate better machine code.
> Then, even if you compile without PGO (as many people do), you still
> get the better machine code.
>
> Regards,
>
>   Neil



-- 
Night gathers, and now my watch begins. It shall not end until my death.


Re: [Python-Dev] Possible performance regression

2019-02-26 Thread Neil Schemenauer
On 2019-02-26, Raymond Hettinger wrote:
> That said, I'm only observing the effect when building with the
> Mac default Clang (Apple LLVM version 10.0.0, clang-1000.11.45.5).
> When building with GCC 8.3.0, there is no change in performance.

My guess is that the code in _PyEval_EvalFrameDefault() got changed
enough that Clang started emitting slightly different machine code.  If
the conditional jumps are a bit different, I understand that could
make a significant difference to performance.

Are you compiling with --enable-optimizations (i.e. PGO)?  In my
experience, that is needed to get meaningful results.  Victor also
mentions that on his "how-to-get-stable-benchmarks" page.  Building
with PGO is really (really) slow so I suspect you are not doing it
when bisecting.  You can speed it up greatly by using a simpler
command for PROFILE_TASK in Makefile.pre.in.  E.g.

PROFILE_TASK=$(srcdir)/my_benchmark.py

Now that you have narrowed it down to a single commit, it would be
worth doing the comparison with PGO builds (assuming Clang supports
that).
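
(For reference, an optimized build is just:

    $ ./configure --enable-optimizations && make

with PROFILE_TASK optionally simplified as above.)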

> That said, it seems to be compiler specific and only affects the
> Mac builds, so maybe we can decide that we don't care.

I think the key question is if the ceval loop got a bit slower due
to logic changes or if Clang just happened to generate a bit worse
code due to source code details.  A PGO build could help answer
that.  I suppose trying to compare machine code is going to produce
too large of a diff.

Could you try hoisting the eval_breaker expression, as suggested by
Antoine:

https://discuss.python.org/t/profiling-cpython-with-perf/940/2

If you think a slowdown affects most opcodes, I think the DISPATCH
change looks like the only cause.  Maybe I missed something though.

Also, maybe there would be some value in marking key branches as
likely/unlikely if it helps Clang generate better machine code.
Then, even if you compile without PGO (as many people do), you still
get the better machine code.

Regards,

  Neil


Re: [Python-Dev] Possible performance regression

2019-02-26 Thread Victor Stinner
On Tue, Feb 26, 2019 at 22:45, Raymond Hettinger  wrote:
> Victor said he generally doesn't care about 5% regressions.  That makes sense 
> for odd corners of Python.  The reason I was concerned about this one is that 
> it hits the eval-loop and seems to affect every single opcode.  The 
> regression applies somewhat broadly, increasing the cost of reading and 
> writing local variables by about 20%.

I ignore changes smaller than 5% because they are usually what I call
the "noise" of the benchmark. It means that testing 3 commits gives 3
different timings, even if the commits don't touch anything used in
the benchmark. There are multiple explanations: PGO compilation is not
deterministic, some benchmarks are too close to the performance of the
CPU L1-instruction cache and so are heavily impacted by "code
locality" (the exact addresses in memory), and many other things.

Hum, sometimes running the same benchmark on the same code on the same
hardware with the same strict procedure gives different timings at
each attempt.

At some point, I decided to give up on these 5% so as not to lose my mind :-)

Victor


Re: [Python-Dev] Possible performance regression

2019-02-26 Thread Raymond Hettinger


On Feb 25, 2019, at 8:23 PM, Eric Snow  wrote:
> 
> So it looks like commit ef4ac967 is not responsible for a performance
> regression.

I did narrow it down to that commit and I can consistently reproduce the timing 
differences.

That said, I'm only observing the effect when building with the Mac default 
Clang (Apple LLVM version 10.0.0, clang-1000.11.45.5).  When building with GCC 
8.3.0, there is no change in performance.

I conclude this is only an issue for Mac builds.

> I ran the "performance" suite (https://github.com/python/performance),
> which has 57 different benchmarks. 

Many of those benchmarks don't measure eval-loop performance.  Instead, they 
exercise json, pickle, sqlite, etc.  So, I would expect no change in many of 
them because the relevant code wasn't touched.

Victor said he generally doesn't care about 5% regressions.  That makes sense 
for odd corners of Python.  The reason I was concerned about this one is that 
it hits the eval-loop and seems to affect every single opcode.  The regression 
applies somewhat broadly, increasing the cost of reading and writing local 
variables by about 20%.

That said, it seems to be compiler specific and only affects the Mac builds, so 
maybe we can decide that we don't care.


Raymond



Re: [Python-Dev] Possible performance regression

2019-02-26 Thread Victor Stinner
I made an attempt once and it was faster:
https://faster-cpython.readthedocs.io/registervm.html

But I had bugs, and I didn't know how to correctly implement a compiler.

Victor

On Tuesday, Feb 26, 2019, Neil Schemenauer  wrote:
> On 2019-02-25, Eric Snow wrote:
>> So it looks like commit ef4ac967 is not responsible for a performance
>> regression.
>
> I did a bit of exploration myself and that was my conclusion as
> well.  Perhaps others would be interested in how to use "perf" so I
> did a little write up:
>
> https://discuss.python.org/t/profiling-cpython-with-perf/940
>
> To me, it looks like using a register-based VM could produce a
> pretty decent speedup.  Research project for someone. ;-)
>
> Regards,
>
>   Neil

-- 
Night gathers, and now my watch begins. It shall not end until my death.


Re: [Python-Dev] Possible performance regression

2019-02-26 Thread Neil Schemenauer
On 2019-02-25, Eric Snow wrote:
> So it looks like commit ef4ac967 is not responsible for a performance
> regression.

I did a bit of exploration myself and that was my conclusion as
well.  Perhaps others would be interested in how to use "perf" so I
did a little write up:

https://discuss.python.org/t/profiling-cpython-with-perf/940

To me, it looks like using a register-based VM could produce a
pretty decent speedup.  Research project for someone. ;-)

Regards,

  Neil


Re: [Python-Dev] Possible performance regression

2019-02-26 Thread Victor Stinner
Hi,

On Tue, Feb 26, 2019 at 05:27, Eric Snow  wrote:
> I ran the "performance" suite (https://github.com/python/performance),
> which has 57 different benchmarks.

Ah yes, by the way: I also manually ran performance on
speed.python.org yesterday: it added a new dot at Feb 25.

> In the results, 9 were marked as
> "significantly" different between the two commits.  2 of the
> benchmarks showed a marginal slowdown and 7 showed a marginal speedup:

I'm not surprised :-) Noise on micro-benchmarks is usually "absorbed
by the std dev" (the delta falls within the std dev).

At speed.python.org, you can see that performance has basically been
stable since last summer.

I'll let you have a look at https://speed.python.org/timeline/

> | Benchmark               | speed.before | speed.after | Change       | Significance          |
> +=========================+==============+=============+==============+=======================+
> | django_template         | 177 ms       | 172 ms      | 1.03x faster | Significant (t=3.66)  |
> +-------------------------+--------------+-------------+--------------+-----------------------+
> | html5lib                | 126 ms       | 122 ms      | 1.03x faster | Significant (t=3.46)  |
> +-------------------------+--------------+-------------+--------------+-----------------------+
> | json_dumps              | 17.6 ms      | 17.2 ms     | 1.02x faster | Significant (t=2.65)  |
> +-------------------------+--------------+-------------+--------------+-----------------------+
> | nbody                   | 157 ms       | 161 ms      | 1.03x slower | Significant (t=-3.85) |
(...)

Usually, I just ignore changes which are smaller than 5% ;-)

Victor
-- 
Night gathers, and now my watch begins. It shall not end until my death.


Re: [Python-Dev] Possible performance regression

2019-02-25 Thread Eric Snow
On Mon, Feb 25, 2019 at 10:42 AM Eric Snow  wrote:
> I'll look into it around then too.  See https://bugs.python.org/issue33608.

I ran the "performance" suite (https://github.com/python/performance),
which has 57 different benchmarks.  In the results, 9 were marked as
"significantly" different between the two commits..  2 of the
benchmarks showed a marginal slowdown and 7 showed a marginal speedup:

+-------------------------+--------------+-------------+--------------+-----------------------+
| Benchmark               | speed.before | speed.after | Change       | Significance          |
+=========================+==============+=============+==============+=======================+
| django_template         | 177 ms       | 172 ms      | 1.03x faster | Significant (t=3.66)  |
+-------------------------+--------------+-------------+--------------+-----------------------+
| html5lib                | 126 ms       | 122 ms      | 1.03x faster | Significant (t=3.46)  |
+-------------------------+--------------+-------------+--------------+-----------------------+
| json_dumps              | 17.6 ms      | 17.2 ms     | 1.02x faster | Significant (t=2.65)  |
+-------------------------+--------------+-------------+--------------+-----------------------+
| nbody                   | 157 ms       | 161 ms      | 1.03x slower | Significant (t=-3.85) |
+-------------------------+--------------+-------------+--------------+-----------------------+
| pickle_dict             | 29.5 us      | 30.5 us     | 1.03x slower | Significant (t=-6.37) |
+-------------------------+--------------+-------------+--------------+-----------------------+
| scimark_monte_carlo     | 144 ms       | 139 ms      | 1.04x faster | Significant (t=3.61)  |
+-------------------------+--------------+-------------+--------------+-----------------------+
| scimark_sparse_mat_mult | 5.41 ms      | 5.25 ms     | 1.03x faster | Significant (t=4.26)  |
+-------------------------+--------------+-------------+--------------+-----------------------+
| sqlite_synth            | 3.99 us      | 3.91 us     | 1.02x faster | Significant (t=2.49)  |
+-------------------------+--------------+-------------+--------------+-----------------------+
| unpickle_pure_python    | 497 us       | 481 us      | 1.03x faster | Significant (t=5.04)  |
+-------------------------+--------------+-------------+--------------+-----------------------+

  (Issue #33608 has more detail.)

So it looks like commit ef4ac967 is not responsible for a performance
regression.

-eric


Re: [Python-Dev] Possible performance regression

2019-02-25 Thread Eric Snow
On Mon, Feb 25, 2019 at 10:32 AM Raymond Hettinger  wrote:
> I got it down to two checkins before running out of time:
>
> Between
> git checkout 463572c8beb59fd9d6850440af48a5c5f4c0c0c9
>
> And:
> git checkout 3b0abb019662e42070f1d6f7e74440afb1808f03
>
> So the subinterpreter patch was likely the trigger.
>
> I can reproduce it over and over again on Clang, but not for a GCC-8 build, 
> so it is compiler specific (and possibly macOS specific).
>
> Will look at it more after work this evening.  I posted here to try to 
> solicit independent confirmation.

I'll look into it around then too.  See https://bugs.python.org/issue33608.

-eric


Re: [Python-Dev] Possible performance regression

2019-02-25 Thread Raymond Hettinger


> On Feb 25, 2019, at 2:54 AM, Antoine Pitrou  wrote:
> 
> Have you tried bisecting to find out the offending changeset, if there
> is any?

I got it down to two checkins before running out of time:

Between
git checkout 463572c8beb59fd9d6850440af48a5c5f4c0c0c9  

And:
git checkout 3b0abb019662e42070f1d6f7e74440afb1808f03  

So the subinterpreter patch was likely the trigger.

I can reproduce it over and over again on Clang, but not for a GCC-8 build, so 
it is compiler specific (and possibly macOS specific).

Will look at it more after work this evening.  I posted here to try to solicit 
independent confirmation.


Raymond


Re: [Python-Dev] Possible performance regression

2019-02-25 Thread Antoine Pitrou
On Sun, 24 Feb 2019 20:54:02 -0800
Raymond Hettinger  wrote:
> I've been running benchmarks that have been stable for a while.  But between 
> today and yesterday, there has been an almost across-the-board performance 
> regression.  

Have you tried bisecting to find out the offending changeset, if there
is any?

Regards

Antoine.




Re: [Python-Dev] Possible performance regression

2019-02-25 Thread Victor Stinner
Hi,

On Mon, Feb 25, 2019 at 05:57, Raymond Hettinger  wrote:
> I've been running benchmarks that have been stable for a while.  But between 
> today and yesterday, there has been an almost across-the-board performance 
> regression.

How do you run your benchmarks? If you use Linux, are you using CPU isolation?

> It's possible that this is a measurement error or something unique to my 
> system (my Mac installed the 10.14.3 release today), so I'm hoping other 
> folks can run checks as well.

Getting reproducible benchmark results for timings smaller than 1 ms is
really hard. I wrote up some advice on getting more stable results:
https://perf.readthedocs.io/en/latest/run_benchmark.html#how-to-get-reproductible-benchmark-results
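
(The perf module can even apply several of those system settings for
you via "python3 -m perf system tune"; some of the tweaks require
root.)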

> Variable and attribute read access:
>    4.0 ns   read_local

In my experience, for timings less than 100 ns, *everything* impacts
the benchmark, and the result is useless without the standard
deviation.

On such microbenchmarks, the hash function has a significant impact
on performance. So you should run your benchmark in multiple different
*processes* to get multiple different hash functions. Some people
prefer to use PYTHONHASHSEED=0 (or another fixed value), but I dislike
doing that since it's less representative of performance "in production"
(with a randomized hash function). For example, using 20 processes to
test 20 randomized hash functions is enough to compute the average cost
of the hash function. That remark is general, though; I didn't look at
the specific case of var_access_benchmark.py. Maybe benchmarks of the
C code also depend on the hash function.

For example, 4.0 ns +/- 10 ns and 4.0 ns +/- 0.1 ns lead to completely
different conclusions when deciding whether "5.0 ns" is slower or faster.
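
To illustrate both points, here is a minimal sketch (the timed
statement and the process count are arbitrary choices) that runs a
microbenchmark in 20 fresh processes, so each run gets its own
randomized hash function, and then reports mean +/- std dev:

    # Run a tiny dict-read benchmark in fresh processes so each run
    # gets a different randomized str hash; report mean +/- std dev.
    import statistics
    import subprocess
    import sys

    SNIPPET = ("import timeit; "
               "print(min(timeit.repeat('d[\"key\"]', 'd = {\"key\": None}', "
               "number=10**6, repeat=5)))")

    timings = []
    for _ in range(20):
        proc = subprocess.run([sys.executable, "-c", SNIPPET],
                              capture_output=True, text=True, check=True)
        timings.append(float(proc.stdout))

    print("mean %.4f s, std dev %.4f s"
          % (statistics.mean(timings), statistics.stdev(timings)))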

The "perf compare" command of my perf module "determines whether two
samples differ significantly using a Student’s two-sample, two-tailed
t-test with alpha equals to 0.95.":
https://en.wikipedia.org/wiki/Student's_t-test

I don't really understand how these things work; I just copied the code
from the old Python benchmark suite :-)
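
The core computation is small, though. Here is a minimal stdlib-only
sketch of that two-sample t-score (the sample data is made up; the
significance threshold would then come from a t-table):

    # Two-sample Student's t-score with pooled variance (stdlib only).
    import math
    import statistics

    def t_score(sample1, sample2):
        n1, n2 = len(sample1), len(sample2)
        # Pooled variance assumes both samples share a common variance.
        pooled = (((n1 - 1) * statistics.variance(sample1)
                   + (n2 - 1) * statistics.variance(sample2))
                  / (n1 + n2 - 2))
        diff = statistics.mean(sample1) - statistics.mean(sample2)
        return diff / math.sqrt(pooled * (1 / n1 + 1 / n2))

    before = [4.0, 4.1, 3.9, 4.2, 4.0]   # made-up timings in ns
    after = [5.0, 4.9, 5.1, 5.2, 5.0]
    print(t_score(before, after))        # large |t| => significant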

See also my articles in my journey to stable benchmarks:

* https://vstinner.github.io/journey-to-stable-benchmark-system.html #
nosy applications / CPU isolation
* https://vstinner.github.io/journey-to-stable-benchmark-deadcode.html # PGO
* https://vstinner.github.io/journey-to-stable-benchmark-average.html
# randomized hash function

There are likely other parameters which impact benchmarks; that's why
the std dev, and how the benchmark is run, matter so much.

Victor
-- 
Night gathers, and now my watch begins. It shall not end until my death.


Re: [Python-Dev] Possible performance regression

2019-02-25 Thread Raymond Hettinger


> On Feb 24, 2019, at 10:06 PM, Eric Snow  wrote:
> 
> I'll look into it in more depth tomorrow.  FWIW, I have a few commits
> in the range you described, so I want to make sure I didn't slow
> things down for us. :)

Thanks for looking into it.

FWIW, I can consistently reproduce the results several times in a row.  Here's 
the bash script I'm using:

#!/bin/bash

make clean
./configure
make                        # Apple LLVM version 10.0.0 (clang-1000.11.45.5)

for i in `seq 1 3`;
do
    git checkout d610116a2e48b55788b62e11f2e6956af06b3de0   # Go back to 2/23
    make                                                    # Rebuild
    sleep 30                            # Let the system get quiet and cool
    echo '--- baseline ---' >> results.txt                  # Label output
    ./python.exe Tools/scripts/var_access_benchmark.py >> results.txt   # Run benchmark

    git checkout 16323cb2c3d315e02637cebebdc5ff46be32ecdf   # Go to end-of-day 2/24
    make                                                    # Rebuild
    sleep 30                            # Let the system get quiet and cool
    echo '--- end of day ---' >> results.txt                # Label output
    ./python.exe Tools/scripts/var_access_benchmark.py >> results.txt   # Run benchmark
done


> 
> -eric
> 
> 
> * commit 175421b58cc97a2555e474f479f30a6c5d2250b0 (HEAD)
> | Author: Pablo Galindo 
> | Date:   Sat Feb 23 03:02:06 2019 +
> |
> | bpo-36016: Add generation option to gc.getobjects() (GH-11909)
> 
> $ ./python Tools/scripts/var_access_benchmark.py
> Variable and attribute read access:
>  18.1 ns   read_local
>  19.4 ns   read_nonlocal

These timings are several times larger than they should be.  Perhaps you're 
running a debug build?  Or perhaps 32-bit?  Or in a VM or some such?  Something 
looks way off because I'm getting 4 and 5 ns on my 2013 Haswell laptop.
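
A quick stdlib-only way to check for those first two (a sketch):

    # Report whether this is a debug build and the pointer width.
    import struct, sysconfig
    print("debug build:", bool(sysconfig.get_config_var("Py_DEBUG")))
    print("pointer size:", struct.calcsize("P") * 8, "bits")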



Raymond











Re: [Python-Dev] Possible performance regression

2019-02-24 Thread Eric Snow
On Sun, Feb 24, 2019 at 10:04 PM Eric Snow  wrote:
> I'll take a look tonight.

I made 2 successive runs of the script (on my laptop) for a commit
from early Saturday, and 2 runs from a commit this afternoon (close to
master).  The output is below, with the earlier commit first.  That
one is a little faster in places and a little slower in others.
However, I also saw quite a bit of variability in the results for the
same commit.  So I'm not sure what to make of it.

I'll look into it in more depth tomorrow.  FWIW, I have a few commits
in the range you described, so I want to make sure I didn't slow
things down for us. :)

-eric


* commit 175421b58cc97a2555e474f479f30a6c5d2250b0 (HEAD)
| Author: Pablo Galindo 
| Date:   Sat Feb 23 03:02:06 2019 +
|
| bpo-36016: Add generation option to gc.getobjects() (GH-11909)

$ ./python Tools/scripts/var_access_benchmark.py
Variable and attribute read access:
  18.1 ns   read_local
  19.4 ns   read_nonlocal
  48.3 ns   read_global
  52.4 ns   read_builtin
  55.7 ns   read_classvar_from_class
  56.1 ns   read_classvar_from_instance
  78.6 ns   read_instancevar
  67.6 ns   read_instancevar_slots
  65.9 ns   read_namedtuple
 106.1 ns   read_boundmethod

Variable and attribute write access:
  25.1 ns   write_local
  26.9 ns   write_nonlocal
  78.0 ns   write_global
 154.1 ns   write_classvar
 132.0 ns   write_instancevar
  88.2 ns   write_instancevar_slots

Data structure read access:
  69.6 ns   read_list
  69.0 ns   read_deque
  68.4 ns   read_dict

Data structure write access:
  73.2 ns   write_list
  79.0 ns   write_deque
 103.5 ns   write_dict

Stack (or queue) operations:
 348.3 ns   list_append_pop
 169.0 ns   deque_append_pop
 170.8 ns   deque_append_popleft

Timing loop overhead:
   1.3 ns   loop_overhead
$ ./python Tools/scripts/var_access_benchmark.py
Variable and attribute read access:
  17.7 ns   read_local
  19.2 ns   read_nonlocal
  39.9 ns   read_global
  50.3 ns   read_builtin
  54.4 ns   read_classvar_from_class
  55.8 ns   read_classvar_from_instance
  80.3 ns   read_instancevar
  70.7 ns   read_instancevar_slots
  66.1 ns   read_namedtuple
 108.9 ns   read_boundmethod

Variable and attribute write access:
  25.1 ns   write_local
  25.6 ns   write_nonlocal
  70.0 ns   write_global
 151.5 ns   write_classvar
 133.9 ns   write_instancevar
  90.7 ns   write_instancevar_slots

Data structure read access:
 140.7 ns   read_list
  89.6 ns   read_deque
  86.6 ns   read_dict

Data structure write access:
  97.9 ns   write_list
 100.5 ns   write_deque
 120.0 ns   write_dict

Stack (or queue) operations:
 375.9 ns   list_append_pop
 179.3 ns   deque_append_pop
 179.4 ns   deque_append_popleft

Timing loop overhead:
   1.5 ns   loop_overhead

* commit 3b0abb019662e42070f1d6f7e74440afb1808f03 (HEAD)
| Author: Giampaolo Rodola 
| Date:   Sun Feb 24 15:46:40 2019 -0800
|
| bpo-33671: allow setting shutil.copyfile() bufsize globally (GH-12016)

$ ./python Tools/scripts/var_access_benchmark.py
Variable and attribute read access:
  20.2 ns   read_local
  20.0 ns   read_nonlocal
  41.9 ns   read_global
  52.9 ns   read_builtin
  56.3 ns   read_classvar_from_class
  56.9 ns   read_classvar_from_instance
  80.2 ns   read_instancevar
  70.6 ns   read_instancevar_slots
  69.5 ns   read_namedtuple
 114.5 ns   read_boundmethod

Variable and attribute write access:
  23.4 ns   write_local
  25.0 ns   write_nonlocal
  74.5 ns   write_global
 152.0 ns   write_classvar
 131.7 ns   write_instancevar
  90.1 ns   write_instancevar_slots

Data structure read access:
  69.9 ns   read_list
  73.4 ns   read_deque
  77.8 ns   read_dict

Data structure write access:
  83.3 ns   write_list
  94.9 ns   write_deque
 120.6 ns   write_dict

Stack (or queue) operations:
 383.4 ns   list_append_pop
 187.1 ns   deque_append_pop
 182.2 ns   deque_append_popleft

Timing loop overhead:
   1.4 ns   loop_overhead
$ ./python Tools/scripts/var_access_benchmark.py
Variable and attribute read access:
  19.1 ns   read_local
  20.9 ns   read_nonlocal
  43.8 ns   read_global
  57.8 ns   read_builtin
  58.4 ns   read_classvar_from_class
  61.3 ns   read_classvar_from_instance
  84.7 ns   read_instancevar
  72.9 ns   read_instancevar_slots
  69.7 ns   read_namedtuple
 109.9 ns   read_boundmethod

Variable and attribute write access:
  23.1 ns   write_local
  23.7 ns   write_nonlocal
  72.8 ns   write_global
 149.9 ns   write_classvar
 133.3 ns   write_instancevar
  89.4 ns   write_instancevar_slots

Data structure read access:
  69.0 ns   read_list
  69.6 ns   read_deque
  69.1 ns   read_dict

Data structure write 

Re: [Python-Dev] Possible performance regression

2019-02-24 Thread Eric Snow
I'll take a look tonight.

-eric

On Sun, Feb 24, 2019, 21:54 Raymond Hettinger  wrote:

> I've been running benchmarks that have been stable for a while.  But
> between today and yesterday, there has been an almost across-the-board
> performance regression.
>
> It's possible that this is a measurement error or something unique to my
> system (my Mac installed the 10.14.3 release today), so I'm hoping other
> folks can run checks as well.
>
>
> Raymond
>
>
> ------- Yesterday -------
>
> $ ./python.exe Tools/scripts/var_access_benchmark.py
> Variable and attribute read access:
>    4.0 ns   read_local
>    4.5 ns   read_nonlocal
>   13.1 ns   read_global
>   17.4 ns   read_builtin
>   17.4 ns   read_classvar_from_class
>   15.8 ns   read_classvar_from_instance
>   24.6 ns   read_instancevar
>   19.7 ns   read_instancevar_slots
>   18.5 ns   read_namedtuple
>   26.3 ns   read_boundmethod
>
> Variable and attribute write access:
>    4.6 ns   write_local
>    4.8 ns   write_nonlocal
>   17.5 ns   write_global
>   39.1 ns   write_classvar
>   34.4 ns   write_instancevar
>   25.3 ns   write_instancevar_slots
>
> Data structure read access:
>   17.5 ns   read_list
>   18.4 ns   read_deque
>   19.2 ns   read_dict
>
> Data structure write access:
>   19.0 ns   write_list
>   22.0 ns   write_deque
>   24.4 ns   write_dict
>
> Stack (or queue) operations:
>   55.5 ns   list_append_pop
>   46.3 ns   deque_append_pop
>   46.7 ns   deque_append_popleft
>
> Timing loop overhead:
>    0.3 ns   loop_overhead
>
>
> ------- Today -------
>
> $ ./python.exe Tools/scripts/var_access_benchmark.py
>
> Variable and attribute read access:
>    5.0 ns   read_local
>    5.3 ns   read_nonlocal
>   14.7 ns   read_global
>   18.6 ns   read_builtin
>   19.9 ns   read_classvar_from_class
>   17.7 ns   read_classvar_from_instance
>   26.1 ns   read_instancevar
>   21.0 ns   read_instancevar_slots
>   21.7 ns   read_namedtuple
>   27.8 ns   read_boundmethod
>
> Variable and attribute write access:
>    6.1 ns   write_local
>    7.3 ns   write_nonlocal
>   18.9 ns   write_global
>   40.7 ns   write_classvar
>   36.2 ns   write_instancevar
>   26.1 ns   write_instancevar_slots
>
> Data structure read access:
>   19.1 ns   read_list
>   19.6 ns   read_deque
>   20.6 ns   read_dict
>
> Data structure write access:
>   22.8 ns   write_list
>   23.5 ns   write_deque
>   27.8 ns   write_dict
>
> Stack (or queue) operations:
>   54.8 ns   list_append_pop
>   49.5 ns   deque_append_pop
>   49.4 ns   deque_append_popleft
>
> Timing loop overhead:
>    0.3 ns   loop_overhead
>
>


[Python-Dev] Possible performance regression

2019-02-24 Thread Raymond Hettinger
I've been running benchmarks that have been stable for a while.  But between 
today and yesterday, there has been an almost across-the-board performance 
regression.  

It's possible that this is a measurement error or something unique to my system 
(my Mac installed the 10.14.3 release today), so I'm hoping other folks can run 
checks as well.


Raymond


------- Yesterday -------

$ ./python.exe Tools/scripts/var_access_benchmark.py
Variable and attribute read access:
   4.0 ns   read_local
   4.5 ns   read_nonlocal
  13.1 ns   read_global
  17.4 ns   read_builtin
  17.4 ns   read_classvar_from_class
  15.8 ns   read_classvar_from_instance
  24.6 ns   read_instancevar
  19.7 ns   read_instancevar_slots
  18.5 ns   read_namedtuple
  26.3 ns   read_boundmethod

Variable and attribute write access:
   4.6 ns   write_local
   4.8 ns   write_nonlocal
  17.5 ns   write_global
  39.1 ns   write_classvar
  34.4 ns   write_instancevar
  25.3 ns   write_instancevar_slots

Data structure read access:
  17.5 ns   read_list
  18.4 ns   read_deque
  19.2 ns   read_dict

Data structure write access:
  19.0 ns   write_list
  22.0 ns   write_deque
  24.4 ns   write_dict

Stack (or queue) operations:
  55.5 ns   list_append_pop
  46.3 ns   deque_append_pop
  46.7 ns   deque_append_popleft

Timing loop overhead:
   0.3 ns   loop_overhead


------- Today -------

$ ./python.exe Tools/scripts/var_access_benchmark.py

Variable and attribute read access:
   5.0 ns   read_local
   5.3 ns   read_nonlocal
  14.7 ns   read_global
  18.6 ns   read_builtin
  19.9 ns   read_classvar_from_class
  17.7 ns   read_classvar_from_instance
  26.1 ns   read_instancevar
  21.0 ns   read_instancevar_slots
  21.7 ns   read_namedtuple
  27.8 ns   read_boundmethod

Variable and attribute write access:
   6.1 ns   write_local
   7.3 ns   write_nonlocal
  18.9 ns   write_global
  40.7 ns   write_classvar
  36.2 ns   write_instancevar
  26.1 ns   write_instancevar_slots

Data structure read access:
  19.1 ns   read_list
  19.6 ns   read_deque
  20.6 ns   read_dict

Data structure write access:
  22.8 ns   write_list
  23.5 ns   write_deque
  27.8 ns   write_dict

Stack (or queue) operations:
  54.8 ns   list_append_pop
  49.5 ns   deque_append_pop
  49.4 ns   deque_append_popleft

Timing loop overhead:
   0.3 ns   loop_overhead

