I found one operation that contributes to the slowdown. When the input is a list of dictionaries, pandas first converts it into a series of numpy.ndarray objects. This conversion may be slow on PyPy.
I made a small example inside the micro benchmark project:

====================
def bench_build_np():
    with Timer('np.array'):
        for i in xrange(N//10):
            myarray = np.array([i])
====================

On PyPy it took about 2.3s, and on CPython about 0.22s.

I would like to find out why PyPy is slow in this case. Although I have read the docs several times, I am not familiar with the internals of the PyPy project. Could you please give me some advice on why dynamically creating NumPy arrays is slow, based on how PyPy works?
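For anyone who wants to reproduce the measurement outside the micro benchmark project, a self-contained sketch might look like the following. It is not the project's code: `time_small_arrays`, `make_array`, and `count` are names made up for illustration, `timeit.default_timer` stands in for the project's `Timer`, and the iteration count is an arbitrary choice rather than the project's `N//10`. The fallback to `list` when NumPy is not installed is only there so the script still runs.

```python
from timeit import default_timer


def time_small_arrays(make_array, count):
    """Build `count` one-element containers with `make_array`.

    Returns (elapsed_seconds, last_container). `make_array` is any callable
    that accepts a list, for example `numpy.array` (hypothetical parameter
    name, chosen for this sketch).
    """
    last = None
    start = default_timer()
    for i in range(count):
        # Same pattern as the original benchmark: one tiny array per iteration.
        last = make_array([i])
    return default_timer() - start, last


if __name__ == "__main__":
    try:
        import numpy as np
        make_array, label = np.array, "np.array"
    except ImportError:
        # Assumption for this sketch only: fall back to a plain list
        # so the script still runs where NumPy is not installed.
        make_array, label = list, "list"

    elapsed, _ = time_small_arrays(make_array, 100000)
    print("%s: %.3fs for 100000 one-element arrays" % (label, elapsed))
```

Running the same file under both interpreters gives two directly comparable numbers, and passing `list` as `make_array` gives a rough baseline for the loop itself without the NumPy call.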