On Mon, 29 Jun 2015 at 14:02 Ozan Çağlayan <ozan...@gmail.com> wrote:
> Hi,
>
> I just downloaded PyPy 2.6.0 just to play with it.
>
> I have a simple line-by-line file reading example where the file is 324MB.
>
> Code:
>
> # Not doing this import crashes PyPy with MemoryError??
> from io import open
>
> a = 0
> f = open(fname)
> for line in f.readlines():
>     a += len(line)
> f.close()
>
> PyPy:
> Python 2.7.9 (295ee98b69288471b0fcf2e0ede82ce5209eb90b, Jun 01 2015, 17:30:13)
> [PyPy 2.6.0 with GCC 4.9.2] on linux2
>
> real 0m6.068s
> user 0m4.582s
> sys 0m0.846s
>
> CPython (2.7.10)
>
> real 0m3.799s
> user 0m2.851s
> sys 0m0.860s
>
> Am I doing something wrong or is this expected?

I tested this with CPython 2.7 and PyPy 2.7 and found that PyPy was
slower, as you say (about 1.6x here). It seems that readlines() is
somehow slower in PyPy. You don't actually need to call readlines() in
this case though, and it's faster not to.

With your code (although I didn't import io.open) I found the timings:

CPython 2.7: 1.4s
PyPy 2.7: 2.3s

I changed it to

    for line in f:  # (not f.readlines())
        a += len(line)

With that change I get:

CPython 2.7: 1.3s
PyPy 2.7: 0.6s

So calling readlines() makes it slower in both CPython and PyPy. If you
don't call readlines() then PyPy is about 2x faster than CPython (at
least on this machine).

Calling readlines() is probably also the reason for the MemoryError.
readlines() reads all of the lines of the file into memory as a list of
Python string objects. If you just loop over f directly then it reads
the file one line at a time, so it requires much less memory. I don't
know how much spare RAM you have, but a MemoryError suggests that you
don't have enough to load the whole file into memory using readlines().

Note that even though this machine has 8GB of RAM (with over 7GB unused)
and can load the file into memory quite comfortably, I still wouldn't
write code that assumed it was okay to load a 324MB file into memory
unless there was some actual need to do that.

--
Oscar
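For reference, here is a minimal sketch of the two approaches discussed above, side by side. The function names (total_length_readlines, total_length_iter) and the make_sample_file helper are hypothetical, introduced only for illustration. It uses the built-in open() rather than io.open and does not attempt to reproduce the timings quoted in this thread.

```python
import os
import tempfile


def total_length_readlines(path):
    """Sum line lengths after loading every line into a list.

    readlines() builds the full list of strings in memory first,
    so peak memory grows with the size of the file.
    """
    with open(path) as f:
        total = 0
        for line in f.readlines():
            total += len(line)
    return total


def total_length_iter(path):
    """Sum line lengths by iterating over the file object directly.

    Only one line is held at a time, so memory use stays small
    regardless of the file size.
    """
    with open(path) as f:
        total = 0
        for line in f:
            total += len(line)
    return total


def make_sample_file(num_lines=1000):
    """Hypothetical helper: write a small temporary text file for testing."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w") as f:
        for i in range(num_lines):
            f.write("line %d\n" % i)
    return path


if __name__ == "__main__":
    sample = make_sample_file()
    try:
        # Both approaches give the same total; they differ in memory use.
        print(total_length_readlines(sample))
        print(total_length_iter(sample))
    finally:
        os.remove(sample)
```

Both functions return the same total; the difference is only in how much of the file is held in memory at once, which is what matters for a 324MB input.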
_______________________________________________
pypy-dev mailing list
pypy-dev@python.org
https://mail.python.org/mailman/listinfo/pypy-dev