Hi,

On Wed, 19 Nov 2014 22:59:05 +0300 Andrew Savchenko wrote:
> On Mon, 17 Nov 2014 21:55:48 -0800 Zac Medico wrote:
[...]
> > > When I'll manage to run emerge -DNupv @world without errors, I'll
> > > send you stats for both runs with and without dynamic deps.
> > 
> > Great, hopefully that will reveal some more good things to optimize.
> > 
> > > By the way, do you need pstats files (e.g. for some extra data) or
> > > pdf graphs are sufficient?
> > 
> > The pdf graphs are typically enough for me, since they highlight the hot
> > spots really well. I did not even bother with your pstats files.
> 
> OK. I managed to run emerge -DNupv @world on desktop without
> conflicts. What was done:
> 1) fixpackages run
> 2) portage is updated to use your patch from bug 529660
> 
> At this point performance boost was really great: from ~35
> minutes to ~19-20 minutes.
> 
> Afterward I tried emerge -DNupv @world with different python
> versions:
>      (2.7)  (~)2.7.8
>      (3.3)  3.3.5-r1
>      (3.4)  (~)3.4.2
> 
> Results are interesting (error bars at the 0.95 confidence level;
> the real time value was used for the calculations):
> 3.3 is 3% ± 5% faster than 2.7
> 3.4 is 20% ± 5% faster than 3.3
> And with python:3.4 and steps above it takes now 15.5 minutes
> instead of 35. Nice result :)
> 
> So there is no evidence that portage on 3.3 is faster than on 2.7,
> but 3.4 is faster than 3.3 with very good confidence. Of course
> this data is biased by -m cProfile overhead, but the bias should be
> similar for each version. I just checked the time to run the command for
> python:3.4 without profiling: it takes 11.5 minutes!
> 
> You may find generated pdf graphs together with system information
> for each host here:
> ftp://brunestud.info/gentoo/portage-v2.tar.xz
> 
> As for the hitomi box, it is both slower and has much older
> packages, so I'm still struggling to fix conflicts and other issues.
> Results will be available later.

I managed to get data from hitomi too, see:
ftp://brunestud.info/gentoo/portage-v3.tar.xz
(this archive also contains all graphs previously obtained)

Graphs were obtained the same way as on desktop.
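For anyone trying to reproduce this, the workflow is roughly the one
sketched below: run the workload under cProfile, dump a .pstats file,
then inspect the hot spots (the PDF call graphs were presumably rendered
from such .pstats files with a tool like gprof2dot, though the exact
command isn't stated in this thread; the stand-in workload here is just
for illustration):

```python
import cProfile
import io
import pstats

# Stand-in workload; emerge itself was profiled with `python -m cProfile`,
# this only demonstrates the mechanism.
def busy():
    return sum(i * i for i in range(100000))

# Profile the call and write the raw stats to a .pstats file.
cProfile.runctx("busy()", globals(), locals(), "demo.pstats")

# Load the stats and print the top entries by cumulative time.
out = io.StringIO()
stats = pstats.Stats("demo.pstats", stream=out)
stats.sort_stats("cumulative").print_stats(5)
print(out.getvalue())
```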
Portage and python versions are the same; timing information for the
_profiled_ runs follows:

>      (2.7)  (~)2.7.8
real    55m19.892s
user    39m11.913s
sys     15m37.586s

>      (3.3)  3.3.5-r1
real    52m34.640s
user    36m45.325s
sys     15m25.663s

>      (3.4)  (~)3.4.2
real    53m32.657s
user    37m12.369s
sys     15m52.641s

Without profiling using 3.3.5-r1:
real    25m50.439s
user    25m28.260s
sys     0m7.863s
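For reference, relative speedups with error bars like the ±5% figures
quoted above can be derived from repeated real-time samples roughly as
follows (a sketch using error propagation on the ratio of means; the
sample data and the t value are illustrative assumptions, not
measurements from this thread):

```python
import statistics

def speedup_ci(times_a, times_b, t_crit=2.776):
    """Percent speedup of b over a with a rough 95% confidence interval.

    t_crit is the two-sided 95% Student's t critical value for 4 degrees
    of freedom (5 samples) -- an assumption for this sketch.
    """
    ma = statistics.mean(times_a)
    mb = statistics.mean(times_b)
    sa = statistics.stdev(times_a) / len(times_a) ** 0.5  # std. error of mean
    sb = statistics.stdev(times_b) / len(times_b) ** 0.5
    ratio = ma / mb  # > 1 means b is faster
    # relative error of a ratio: sqrt((sa/ma)^2 + (sb/mb)^2)
    rel_err = ((sa / ma) ** 2 + (sb / mb) ** 2) ** 0.5
    return (ratio - 1.0) * 100.0, t_crit * ratio * rel_err * 100.0

# hypothetical `time` real samples in seconds (NOT this thread's data)
gain, err = speedup_ci([1200, 1215, 1190, 1205, 1210],
                       [1000, 990, 1010, 1005, 995])
print("%.0f%% +/- %.0f%%" % (gain, err))
```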

This is quite surprising. On hitomi (Intel Atom N270) there is no
difference between 3.3 and 3.4, but both are slightly better than
2.7. (To exclude possible cache effects I made a blank run before the
first test run.) Probably this is due to some arch-dependent
optimizations.

What surprises me most is that the profiling overhead is huge (~105%)
compared to the overhead on the desktop (~35%). CPU speeds are not that
different, nor are the instruction sets (Atom has sse2, sse3, ssse3, but
lacks 3dnow, 3dnowext). The L2 cache is the same (512 KB), but L1
differs significantly: 64 KB data/64 KB instruction cache vs 24 KB
data/32 KB instruction cache. Looks like this is the reason.
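As a sanity check, the overhead percentages follow directly from the
profiled vs. unprofiled wall-clock times reported in this thread
(desktop: 15.5 min profiled vs 11.5 min plain on python:3.4; hitomi:
52m34.640s profiled vs 25m50.439s plain on python:3.3):

```python
# Cross-check of the profiling-overhead figures from the timings above.
def overhead(profiled_min, plain_min):
    """Profiling overhead as a percentage of the unprofiled runtime."""
    return (profiled_min / plain_min - 1.0) * 100.0

desktop = overhead(15.5, 11.5)                        # desktop, python:3.4
hitomi = overhead(52 + 34.640 / 60, 25 + 50.439 / 60)  # hitomi, python:3.3
print("desktop: %.0f%%, hitomi: %.0f%%" % (desktop, hitomi))
```

which lands close to the ~35% and ~105% figures quoted above.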

Best regards,
Andrew Savchenko

