On Mon, 29 Jun 2015 at 14:02 Ozan Çağlayan <ozan...@gmail.com> wrote:

> Hi,
>
> I just downloaded PyPy 2.6.0 just to play with it.
>
>
> I have a simple line-by-line file reading example where the file is 324MB.
>
> Code:
>
> # Not doing this import crashes PyPy with MemoryError??
> from io import open
>
> a = 0
> f = open(fname)
> for line in f.readlines():
>   a += len(line)
> f.close()
>
> PyPy:
> Python 2.7.9 (295ee98b69288471b0fcf2e0ede82ce5209eb90b, Jun 01 2015,
> 17:30:13)
> [PyPy 2.6.0 with GCC 4.9.2] on linux2
>
> real 0m6.068s
> user 0m4.582s
> sys 0m0.846s
>
> CPython (2.7.10)
>
> real 0m3.799s
> user 0m2.851s
> sys 0m0.860s
>
> Am I doing something wrong or is this expected?
>

I tested this with CPython 2.7 and PyPy 2.7 and found PyPy to be slower, as
you say (about 1.6x slower on this machine). It seems that readlines() is
somehow slower in PyPy. You don't actually need to call readlines() in this
case though, and it's faster not to.

With your code (although I didn't import io.open) I found the timings:
CPython 2.7: 1.4s
PyPy 2.7: 2.3s

I changed it to
for line in f: # (not f.readlines())
    a += len(line)

With that change I get:
CPython 2.7: 1.3s
PyPy 2.7: 0.6s

So calling readlines() makes it slower in both CPython and PyPy. If you
don't call readlines() then PyPy is 2x faster than CPython (at least on
this machine).
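
For anyone who wants to repeat the comparison, here is a minimal sketch of the two variants with a simple wall-clock timer. The function names (total_length_readlines, total_length_iter, timed) are made up for illustration and are not from the original test; the numbers will of course vary by machine and interpreter.

```python
import time


def total_length_readlines(path):
    """Sum line lengths after loading every line into a list first."""
    total = 0
    with open(path) as f:
        for line in f.readlines():
            total += len(line)
    return total


def total_length_iter(path):
    """Sum line lengths while streaming the file one line at a time."""
    total = 0
    with open(path) as f:
        for line in f:
            total += len(line)
    return total


def timed(func, path):
    """Return (result, elapsed seconds) for a single call of func(path)."""
    start = time.time()
    result = func(path)
    return result, time.time() - start
```

Calling timed(total_length_readlines, fname) and timed(total_length_iter, fname) under each interpreter should give figures comparable to the ones above. A single run is noisy, so repeating each call a few times and taking the best time is probably wise.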

The MemoryError was probably also caused by readlines(), which reads all of
the lines of the file into memory as a list of Python string objects. If you
just loop over f directly then it reads the file one line at a time and so
requires much less memory. I don't know how much spare RAM you have, but a
MemoryError suggests that you don't have enough to load the whole file into
memory using readlines().
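
The memory difference can be observed directly. The following is only a sketch and makes an assumption that does not match the interpreters in this thread: it uses the tracemalloc module, which is available on CPython 3.4 and later, not on Python 2.7. The helper names (peak_bytes, sum_with_readlines, sum_with_iteration, make_sample_file) are hypothetical, and the sample file is a small generated one, not the 324MB file discussed here.

```python
import os
import tempfile
import tracemalloc


def peak_bytes(func, path):
    """Run func(path) and return the peak traced allocation in bytes."""
    tracemalloc.start()
    try:
        func(path)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        # Stopping also discards the recorded traces, so each call starts fresh.
        tracemalloc.stop()
    return peak


def sum_with_readlines(path):
    """Build the full list of lines, then sum their lengths."""
    with open(path) as f:
        return sum(len(line) for line in f.readlines())


def sum_with_iteration(path):
    """Sum line lengths while reading one line at a time."""
    with open(path) as f:
        return sum(len(line) for line in f)


def make_sample_file(num_lines=20000, line="x" * 99 + "\n"):
    """Write a throwaway file (about 2 MB by default) and return its path."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w") as f:
        for _ in range(num_lines):
            f.write(line)
    return path
```

Comparing peak_bytes(sum_with_readlines, path) with peak_bytes(sum_with_iteration, path) on a file from make_sample_file() should show the readlines() version peaking at roughly the size of the whole file or more, while the iterating version stays near the size of the read buffer. Remember to delete the temporary file afterwards with os.remove(path).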

Note that even though this machine has 8GB of RAM (with over 7GB unused)
and can load the file into memory quite comfortably, I still wouldn't write
code that assumed it was okay to load a 324MB file into memory unless there
was some actual need to do that.

--
Oscar
_______________________________________________
pypy-dev mailing list
pypy-dev@python.org
https://mail.python.org/mailman/listinfo/pypy-dev
