On Sun, Mar 1, 2009 at 11:29 AM, Michael Gilbert
<michael.s.gilb...@gmail.com> wrote:
> On Sun, 1 Mar 2009 16:12:14 -0500 Gideon Simpson wrote:
>
>> So I have some data sets of about 160000 floating point numbers stored
>> in text files. I find that loadtxt is rather slow. Is this to be
>> expected? Would it be faster if it were loading binary data?
>
> i have run into this as well. loadtxt uses a python list to allocate
> memory for the data it reads in, so once you get to about 1/4th of your
> available memory, it will start allocating the updated list (every
> time it reads a new value from your data file) in swap instead of main
> memory, which is ridiculously slow (in fact it makes my system quite
> unresponsive and gives me a jumpy cursor). i have rewritten loadtxt to
> be smarter about allocating memory, but it is slower overall and
> doesn't support all of the original arguments/options (yet). i have
> some ideas to make it smarter/more efficient, but have not had the
> time to work on it recently.
>
> i will send the current version to the list tomorrow when i have
> access to the system that it is on.
>
> best wishes,
> mike
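for illustration, a minimal sketch of the preallocation idea mike
describes: two passes over the file, one to count rows and one to fill
a preallocated array, so no python list grows value by value. the name
loadtxt_prealloc is made up here, this is not his actual rewrite, and
it assumes a whitespace-delimited file with a uniform column count:

import numpy as np

def loadtxt_prealloc(fname):
    # first pass: count non-blank rows and note the column count
    nrows = 0
    ncols = None
    with open(fname) as f:
        for line in f:
            if line.strip():
                if ncols is None:
                    ncols = len(line.split())
                nrows += 1
    # second pass: fill a preallocated array instead of growing a list
    out = np.empty((nrows, ncols), dtype=float)
    i = 0
    with open(fname) as f:
        for line in f:
            if line.strip():
                out[i] = [float(x) for x in line.split()]
                i += 1
    return out

the second pass costs an extra read of the file, but the working set
stays at one ndarray plus one line, so nothing spills into swap.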
to address the slowness, i use wrappers around savetxt/loadtxt that
save/load a .npy file along with/instead of the .txt file, and the
loadtxt wrapper checks whether the .npy is up to date. code here:

http://rafb.net/p/dGBJjg80.html

of course it's still slow the first time. i look forward to your
speedups.

-brentp
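a minimal sketch of that kind of wrapper (the names savetxt_cached and
loadtxt_cached are made up, and this is not the actual code at the
link above): write a binary .npy next to the .txt, and reuse it on
load whenever its mtime shows it is at least as new as the text file:

import os
import numpy as np

def savetxt_cached(fname, arr, **kwargs):
    # write the text file, plus a binary .npy copy alongside it
    np.savetxt(fname, arr, **kwargs)
    np.save(fname + '.npy', arr)

def loadtxt_cached(fname, **kwargs):
    # reuse the .npy cache when it is at least as new as the .txt
    npy = fname + '.npy'
    if (os.path.exists(npy)
            and os.path.getmtime(npy) >= os.path.getmtime(fname)):
        return np.load(npy)
    arr = np.loadtxt(fname, **kwargs)
    np.save(npy, arr)  # refresh the cache for next time
    return arr

after the first parse, every later load is a single np.load, which
skips the text parsing entirely.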