On Tue, Aug 9, 2011 at 5:07 AM, Josh Ayers <josh.ay...@gmail.com> wrote:
>
> I generated a 1 million element array using the built-in array module, wrote
> it to disk, and then read it back in.  See http://pastie.org/2342676 for the
> code.
>
> Each operation was slower with PyPy than with CPython.
>
> * Array creation - CPython: 0.16s - PyPy: 0.47s

What's taking time here is iterating over the range list and unwrapping all
the integers. If all you want is to allocate an array, it's significantly
faster (on both PyPy and CPython) to do:

    a = array.array(outputDataType, [0]) * dataSize

> * Writing file - CPython: 0.05s - PyPy: 0.11s
> * Reading file: CPython: 0.02s - PyPy: 0.08s

The built-in array module uses space.call_method(w_f, 'write') and
space.call_method(w_f, 'read') to implement tofile and fromfile,
respectively. For fromfile that means copying the data at least once, and
maybe that's what's going on with tofile too. I don't know how hard it would
be to add a fast path for the common cases that reads and writes data
directly from and to the array buffer.

>
> This method won't quite work for me in any case - I need to store 64 bit
> integers, and the built-in array module doesn't support them.  To get around
> that, I modified the pure-python array.py that comes in the pypy\lib_pypy
> directory.  I added a "q" to the end of the line "TYPECODES = ..." which
> represents a 64 bit signed integer within the struct module.  I saved that
> modified file as array2.py and imported it in place of the built-in array.
> See http://pastie.org/2342721 for the code.
>
> That allowed me to use a 64 bit integer, but the array creation step was
> again much slower on PyPy than it was on CPython.  The disk accessing steps
> were more similar, and are probably at about the limit of the hard disk
> anyway, but creating the array takes much longer under PyPy.

Why do you think this is limited by the hard disk? I would expect the
pure-Python approach to be slower than the built-in module. Did you try it
with a datatype that the built-in module supports, so that the performance
of the two approaches can be compared directly?

--
Håkan Ardö
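
The allocation advice above can be checked with a small timing script. This
is only a minimal sketch, not the code from the pastie links. It assumes the
benchmark built its array from a range, as the reply suggests. The typecode
'l' stands in for outputDataType, and the helper names create_from_range,
create_by_repeat, and timed are hypothetical.

```python
import array
import time

TYPECODE = "l"         # assumption: stands in for outputDataType
DATA_SIZE = 1000000    # 1 million elements, as in the quoted benchmark


def create_from_range(typecode, size):
    # Builds the array by iterating over a range, so every integer
    # is visited and unwrapped individually.
    return array.array(typecode, range(size))


def create_by_repeat(typecode, size):
    # Allocates a zero-filled array by repeating a one-element array.
    return array.array(typecode, [0]) * size


def timed(func, *args):
    # Returns the function result and the elapsed wall-clock seconds.
    start = time.time()
    result = func(*args)
    return result, time.time() - start


if __name__ == "__main__":
    _, t_range = timed(create_from_range, TYPECODE, DATA_SIZE)
    _, t_repeat = timed(create_by_repeat, TYPECODE, DATA_SIZE)
    print("from range: %.3fs" % t_range)
    print("by repeat:  %.3fs" % t_repeat)
```

The two functions are not interchangeable: the first fills the array with
0..size-1, while the second only allocates zeros, so the repeat form helps
only when the initial contents do not matter. Running the script under both
interpreters gives the comparison. The timings depend on the machine and the
interpreter version.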