2013/2/18 Eleytherios Stamatogiannakis <est...@gmail.com>

> On 18/02/13 18:44, Maciej Fijalkowski wrote:
>
>> On Mon, Feb 18, 2013 at 6:20 PM, Eleytherios Stamatogiannakis
>> <est...@gmail.com> wrote:
>>
>>> We have found another (very simple) madIS query where PyPy is around 250x
>>> slower than CPython:
>>>
>>> CPython: 314msec
>>> PyPy: 1min 16sec
>>>
>>> The query, if you would like to test it yourself, is the following:
>>>
>>> select count(*) from (file 'some_big_text_file.txt' limit 100000);
>>>
>>> To run it you'll need some big text file containing at least 100000 text
>>> lines (we have run the above query with a very big XML file). You can also
>>> run the above query with a lower limit (the behaviour will be the same) as
>>> such:
>>>
>>> select count(*) from (file 'some_big_text_file.txt' limit 10000);
>>>
>>> Be careful that the file does not have a csv, tsv, json, db or gz ending,
>>> because a different code path inside the "file" operator will be taken
>>> than the one for simple text files.
>>>
>>> l.
>>>
>>> _______________________________________________
>>> pypy-dev mailing list
>>> pypy-dev@python.org
>>> http://mail.python.org/mailman/listinfo/pypy-dev
>>
>> Hey
>>
>> It would be incredibly convenient if you could change it to be a
>> standalone benchmark (say reading a large string from a file and
>> decoding it as a whole or in pieces);
>
> As it involves SQLite, CFFI and Python, it is very hard to extract the
> full execution path that madIS goes through even in a simple query like
> this.
>
> Nevertheless we extracted a part of the pure Python execution path, and
> PyPy is around 50% slower than CPython:
>
> CPython: 21 sec
> PyPy: 33 sec
>
> The full madIS execution path involves additional CFFI calls and callbacks
> (from SQLite) to pass the data to SQLite.
>
> To run the test.py:
>
> test.py big_text_file
>
Most of the time is spent in file iteration.
I added f = f.read().splitlines() and the query is almost instant.

--
Amaury Forgeot d'Arc
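
For anyone who wants to try the two variants side by side, here is a minimal sketch. The contents of test.py are not shown in this thread, so this only assumes the workload is roughly "count the first N lines of a text file". The function names and the sample-file helper are made up for illustration and are not part of madIS.

```python
import itertools
import os
import tempfile


def count_lines_iterating(path, limit):
    """Count up to `limit` lines by iterating over the file object itself."""
    with open(path) as f:
        return sum(1 for _ in itertools.islice(f, limit))


def count_lines_preloaded(path, limit):
    """Count up to `limit` lines after reading the whole file into memory."""
    with open(path) as f:
        # The one-line change described above: replace the file object
        # with a list of lines (line endings are dropped by splitlines()).
        lines = f.read().splitlines()
    return sum(1 for _ in itertools.islice(lines, limit))


def make_sample_file(num_lines):
    """Write a throwaway text file (hypothetical stand-in for big_text_file)."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w") as f:
        for i in range(num_lines):
            f.write("line %d\n" % i)
    return path
```

Wrapping each call in time.time() on a large file should show whether file iteration is the bottleneck on a given interpreter. Note that the preloaded variant reads the entire file even when the limit is small, so it trades memory for speed.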