Thanks! I ran it again on a much larger input and let it print the lines/sec speed on every millionth line (either valid or invalid).
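A minimal sketch of that kind of periodic speed printout. The real script is not shown in this thread, so everything here is an assumption: the names `process_with_speed` and `process_line` are hypothetical, and the speed is measured over each interval rather than cumulatively.

```python
import sys
import time


def process_with_speed(lines, process_line, every=1000000,
                       clock=time.time, report=None):
    """Feed each line to process_line; report lines/sec once per `every` lines.

    All names here are hypothetical. The reported speed covers only the
    lines handled since the previous report (an assumption).
    """
    if report is None:
        report = lambda msg: sys.stdout.write(msg + "\n")
    count = 0
    mark = clock()  # start of the current measuring interval
    for line in lines:
        process_line(line)  # counts valid and invalid lines alike
        count += 1
        if count % every == 0:
            now = clock()
            elapsed = now - mark
            if elapsed > 0:
                report("SPEED %d l/s" % (every / elapsed))
            mark = now  # begin the next interval
    return count
```

Measuring per interval makes a gradual speedup visible, which a cumulative average would smooth out.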
SPEED 6588 l/s
SPEED 8208 l/s
SPEED 9172 l/s
SPEED 10351 l/s
SPEED 16946 l/s
SPEED 23263 l/s
662.6 secs, 973701 valid lines (5610778 invalid), 9937 l/s, max density 73 l/s
[1c3dac321147] {jit-summary
Tracing:                    2794    8.313955
Backend:                    2245    1.946692
TOTAL:                              667.678971
ops:                        5768705
recorded ops:               1478597
  calls:                    231321
guards:                     392450
opt ops:                    456372
opt guards:                 101057
opt guards shared:          61039
forcings:                   0
abort: trace too long:      52
abort: compiling:           0
abort: vable escape:        497
abort: bad loop:            0
abort: force quasi-immut:   0
nvirtuals:                  284152
nvholes:                    146657
nvreused:                   90634
vecopt tried:               0
vecopt success:             0
Total # of loops:           583
Total # of bridges:         1778
Freed # of loops:           140
Freed # of bridges:         189
[1c3dac33785b] jit-summary}

CPython again for comparison on the same input:

SPEED 8819 l/s
SPEED 9625 l/s
SPEED 10285 l/s
SPEED 11384 l/s
SPEED 16428 l/s
SPEED 20588 l/s
596.8 secs, 973701 valid lines (5610778 invalid), 11032 l/s, max density 73 l/s

Interesting that after 5 million lines the PyPy speed exceeded the CPython
somehow. Both runs got faster with time, probably due to my insane level of
local caching of values (less SQL required).

Anyway, I still hesitate whether pypy was really still warming up all that
time...

Thanks,

Vlada


On 31.3.2017 09:58, Maciej Fijalkowski wrote:
> What I meant is that ORM is slow *and* it takes forever to warmup.
> Your code might not run long enough for the ORM to be warm. It's also
> very likely it'll end up slower on pypy. one thing you can do is to
> run PYPYLOG=jit-summary:- pypy <your program> and copy paste the
> summary output
>
> The only way to store the warmed up state is to keep the process alive
> (as a daemon) and rerun it further. You can see if it speeds up after
> two or three runs in one process and make decisions accordingly.
>
> On Thu, Mar 30, 2017 at 2:09 PM, Vláďa Macek <ma...@sandbox.cz> wrote:
>> Hi Maciej (and others?),
>>
>> I know I must be one of many who wanted a gain without pain.
>> :-) Just gave it a try without having an opportunity for some deeper
>> profiling due to my project deadlines. I just thought to get in touch in
>> case I missed something apparent to you from the combination I reported.
>>
>> ORM might be slow, but I compare interpreters, not ORMs. Here's my
>> program's final stats of processing the input file (nginx access log):
>>
>> CPython 2.7.6 32bit
>> 130.1 secs, 177492 valid lines (866160 invalid), 8021 l/s, max density 72 l/s
>>
>> pypy2-v5.7.0-linux32
>> 183.0 secs, 177492 valid lines (866160 invalid), 5703 l/s, max density 72 l/s
>>
>> This is a longer run than what I tried previously and surely this is not a
>> "double time". But still significantly slower.
>>
>> Each line is analyzed using a regexp, which I read is slow in pypy.
>>
>> Both runs have exactly the same input and output. Subjectively, the
>> processing debugging output really got faster gradually for pypy, while
>> cpython runs at constant speed. Is it normal that the warmup can take
>> minutes? I don't know the details.
>>
>> In production, this processing is run from cron every five minutes. Is it
>> possible to store the warmed-up state between runs? (Note: I have *.pyc
>> files disabled at home using PYTHONDONTWRITEBYTECODE=1.)
>>
>> I know it's annoying I don't share code and I'm sorry. With this mail I
>> just wanted to give out some numbers for the possibly curious.
>>
>> The pypy itself is interesting and I hope I'll return to it someday more
>> thoroughly.
>>
>> Thanks again & have a nice day,
>>
>> Vláďa
>>
>>
>> On 27.3.2017 17:21, Maciej Fijalkowski wrote:
>>> Hi Vlada
>>>
>>> Generally speaking, if we can't have a look there is incredibly little
>>> we can do. "I have a program" can be pretty much anything.
>>>
>>> It is well known that django ORM is very slow (both on pypy and on
>>> cpython) and makes the JIT take forever to warm up.
>>> I have absolutely no idea how long your run is at full CPU, but this is
>>> definitely one of your suspects.
>>>
>>> On Sun, Mar 26, 2017 at 1:06 PM, Vláďa Macek <ma...@sandbox.cz> wrote:
>>>> Hi, recently I asked my friends to run my sort of a benchmark on their
>>>> machines (attached). The goal was to test the speed of different data
>>>> access in python2 and python3, 32bit and 64bit. One of my friends sent me
>>>> the pypy results -- the script ran fast as hell! Astounding.
>>>>
>>>> At home I have a 64bit Dell laptop running 32bit Ubuntu 14.04. I downloaded
>>>> your binary
>>>> https://bitbucket.org/pypy/pypy/downloads/pypy2-v5.7.0-linux32.tar.bz2 and
>>>> confirmed my friend's results, wow.
>>>>
>>>> I develop a large Django project that includes a big amount of background
>>>> data processing. It reads large files, computes, and issues much SQL to
>>>> postgresql via psycopg2, every 5 minutes. It heavily uses the memcache
>>>> daemon between runs.
>>>>
>>>> I'd welcome a speedup here very much.
>>>>
>>>> So let's give it a try. Installed psycopg2cffi (via pip in virtualenv), set
>>>> up the paths and ran. The computation printouts were the same, very
>>>> promising -- taking into account how complicated the project is! The SQL
>>>> looked right too. My respect on compatibility!
>>>>
>>>> Unfortunately, the time needed to complete was double in comparison with
>>>> CPython 2.7 for exactly the same task.
>>>>
>>>> You mention you might have some tips for why it's slow. Are you interested
>>>> in getting in touch? Although I rather can't share the code and data with
>>>> you, I'm offering a real world example of significant load that might help
>>>> PyPy get better.
>>>>
>>>> Thank you,
>>>>
>>>> --
>>>> : Vlada Macek  :  http://macek.sandbox.cz  :  +420 608 978 164
>>>> : UNIX && Dev || Training : Python, Django : PGP key 97330EBD
>>>>
>>>> (Disclaimer: The opinions expressed herein are not necessarily those
>>>> of my employer, not necessarily mine, and probably not necessary.)
>>>>

--
: Vlada Macek  :  http://macek.sandbox.cz  :  +420 608 978 164
: UNIX && Dev || Training : Python, Django : PGP key 97330EBD

(Disclaimer: The opinions expressed herein are not necessarily those
of my employer, not necessarily mine, and probably not necessary.)
_______________________________________________
pypy-dev mailing list
pypy-dev@python.org
https://mail.python.org/mailman/listinfo/pypy-dev