Re: preallocate list
John Machin wrote:
> On Wed, 13 Apr 2005 14:28:51 +0100, Jim <[EMAIL PROTECTED]> wrote:
>
>> Thanks for the suggestions. I guess I must ensure that this is my
>> bottleneck.
>>
>>     def readFactorsIntoList(self,filename,numberLoads):
>
> 1. "numberLoads" is not used.
>
>>         factors = []
>>         f = open(self.basedir + filename,'r')
>>         line = f.readline()
>>         tokens = line.split()
>>         columns = len(tokens)
>>         if int(columns) == number:
>
> 2. "columns" is already an int (unless of course you've redefined
> "len"!). Doing int(columns) is pointless.
> 3. What is "number"? Same as "numberLoads"?
> 4. Please explain in general what is the layout of your file and in
> particular, what is the significance of the first line of the file and
> of the above "if" test.
>
>>             for line in f:
>>                 factor = []
>>                 tokens = line.split()
>>                 for i in tokens:
>>                     factor.append(float(i))
>
> 4. "factor" is built and then not used any more??
>
>>                 factors.append(loadFactor)
>
> 5. What is "loadFactor"? Same as "factor"?
>
>>         else:
>>             for line in f:
>>                 tokens = line.split()
>>                 factors.append([float(tokens[0])] * number)
>
> 6. You throw away any tokens in the line after the first??
>
>>         return factors
>>
>> OK. I've just tried with 4 lines and the code works.
>
> Which code works? The code you posted? Please define "works".
>
>> With 11000 lines it uses all CPU for at least 30 secs. There must be
>> a better way.
>
> Perhaps after you post the code that you've actually run, and explained
> what your file layout is, and what you are trying to achieve, then we
> can give you some meaningful help.
>
> Cheers,
> John

Thanks for looking, John. For that I should take a little time to explain.

I tried to rename the variables, as some of them were four words long, and I got a couple of the renames wrong. Sorry.

Regarding 'works': I meant that with a text file of four lines the code completed. With my desired size of 11000 lines it didn't complete within the limits of my patience. I didn't try any other size.

Also, I perhaps wrongly used the newsgroup thread paradigm in trying to restart my query with extra information (which turned out a little faulty). Luckily the other branches yielded fruit.

Thanks again
Jim
--
http://mail.python.org/mailman/listinfo/python-list
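For readers following the review above, here is a minimal sketch of how the posted function might look once the renaming slips are undone. It assumes that `number` was meant to be `numberLoads` and that `loadFactor` was meant to be `factor`, which the thread suggests but never confirms. The name `read_factors`, its stand-alone form, and taking an iterable of lines instead of a filename are illustrative choices, written for current Python 3. As in the posted code, the first line is only used to count columns.

```python
import io


def read_factors(lines, number_loads):
    """Parse whitespace-separated floats, one row per line.

    Assumption: the first line is only inspected for its column count,
    mirroring the code posted in the thread.
    """
    lines = iter(lines)
    first = next(lines, None)
    if first is None:          # empty input: nothing to parse
        return []

    factors = []
    if len(first.split()) == number_loads:
        # One value per load: keep every token of each remaining line.
        for line in lines:
            factors.append([float(tok) for tok in line.split()])
    else:
        # Otherwise repeat the first token of each line for every load.
        for line in lines:
            factors.append([float(line.split()[0])] * number_loads)
    return factors


# io.StringIO stands in for an open text file.
sample = io.StringIO("h1 h2\n1 2\n3 4\n")
rows = read_factors(sample, 2)
```

Whether the first line is really a header, and whether the `else` branch should discard the remaining tokens, are exactly the open questions raised in the review, so this sketch only preserves the posted behaviour.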
Re: preallocate list
On Wed, 13 Apr 2005 14:28:51 +0100, Jim <[EMAIL PROTECTED]> wrote:

> Thanks for the suggestions. I guess I must ensure that this is my
> bottleneck.
>
>     def readFactorsIntoList(self,filename,numberLoads):

1. "numberLoads" is not used.

>         factors = []
>         f = open(self.basedir + filename,'r')
>         line = f.readline()
>         tokens = line.split()
>         columns = len(tokens)
>         if int(columns) == number:

2. "columns" is already an int (unless of course you've redefined "len"!). Doing int(columns) is pointless.
3. What is "number"? Same as "numberLoads"?
4. Please explain in general what is the layout of your file and in particular, what is the significance of the first line of the file and of the above "if" test.

>             for line in f:
>                 factor = []
>                 tokens = line.split()
>                 for i in tokens:
>                     factor.append(float(i))

4. "factor" is built and then not used any more??

>                 factors.append(loadFactor)

5. What is "loadFactor"? Same as "factor"?

>         else:
>             for line in f:
>                 tokens = line.split()
>                 factors.append([float(tokens[0])] * number)

6. You throw away any tokens in the line after the first??

>         return factors
>
> OK. I've just tried with 4 lines and the code works.

Which code works? The code you posted? Please define "works".

> With 11000 lines it uses all CPU for at least 30 secs. There must be
> a better way.

Perhaps after you post the code that you've actually run, and explained what your file layout is, and what you are trying to achieve, then we can give you some meaningful help.

Cheers,
John
Re: preallocate list
Bill Mill <[EMAIL PROTECTED]> writes:

> Bill Mill <[EMAIL PROTECTED]> writes:
>
>> I would profile your app to see that it's your append which is taking
>> ages, but to preallocate a list of strings would look like:
>>
>>     ["This is an average length string" for i in range(approx_length)]

I don't think there's any point putting strings into the preallocated list. A list is just an array of pointers to objects, so any object will do fine for preallocation, no matter what the list will be used for.

>> My guess is that it won't help to preallocate, but time it and let us
>> know. A test to back my guess:
>>
>>     import timeit, math
>>
>>     def test1():
>>         lst = [0 for i in range(10)]
>>         for i in xrange(10):
>>             lst[i] = math.sin(i) * i
>>
>>     def test2():
>>         lst = []
>>         for i in xrange(10):
>>             lst.append(math.sin(i) * i)
...
> The results change slightly when I actually insert an integer, instead
> of a float, with lst[i] = i and lst.append(i):
>
>     09:14 AM ~$ python test.py
>     time1: 3.352000
>     time2: 3.672000

If you use

    lst = range(10)

or even better

    lst = [None]*10

then test1 is more than twice as fast as test2:

    time1: 2.437730
    time2: 5.308054

(using python 2.4). Your code

    lst = [0 for i in range(10)]

made python do an extra 10-iteration loop.

Dan
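The comparison described above can be re-run on a current interpreter with a sketch like the following. It is a Python 3 adaptation, not the code from the thread: the function names and the repeat count are illustrative, and no timings are claimed here because they depend on the interpreter and machine.

```python
import math
import timeit

N = 10  # list length used in the thread's benchmark


def fill_preallocated():
    # [None] * N builds N references to the single None object in one step.
    lst = [None] * N
    for i in range(N):
        lst[i] = math.sin(i) * i
    return lst


def fill_by_append():
    # Grow the list one element at a time.
    lst = []
    for i in range(N):
        lst.append(math.sin(i) * i)
    return lst


if __name__ == "__main__":
    for fn in (fill_preallocated, fill_by_append):
        seconds = timeit.timeit(fn, number=100_000)
        print(f"{fn.__name__}: {seconds:.3f} s")
```

Both functions return the same list, so the only thing being measured is the cost of filling it.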
Re: preallocate list
>     ivec = n*[None]
>
> so that if I use a list element before initializing it, for example
>
>     ivec[0] += 1
>
> I get an error message
>
>     File "xxnone.py", line 2, in ?
>       ivec[0] += 1
>     TypeError: unsupported operand type(s) for +=: 'NoneType' and 'int'
>
> This is in the same spirit as Python's (welcome) termination of a
> program when one tries to use an uninitialized scalar variable.

I feel foolish that I forgot about *. I've just started with Python, then took 2 weeks off. I'll explore pre-allocation when I'm back up to speed.

Yep, I use None a lot.

Thanks
Jim
Re: preallocate list
Steven Bethard wrote:
> Jim wrote:
>> What I really want is a Numeric array but I don't think Numeric
>> supports importing files.
>
> Hmmm... Maybe the scipy package? I think scipy.io.read_array might
> help, but I've never used it.
>
> STeVe

Sounds promising. I only got Numeric because I wanted scipy, but I've hardly explored it, as I kept running into problems even with the complicated examples cut and pasted into a file ;)

Oh yeah, I wanted to explore the GA module, but there were no docs :( and I got busy doing other stuff.

Thanks
Jim
Re: preallocate list
F. Petitjean wrote:
> On Wed, 13 Apr 2005 16:46:53 +0100, Jim wrote:
>> What I really want is a Numeric array but I don't think Numeric
>> supports importing files.
>
> Numeric arrays can be serialized from/to files through pickles:
>
>     import Numeric as N
>     help(N.load)
>     help(N.dump)
>
> (and it is space efficient)
>
>> Jim

Yeah, thanks. I'm generating them using Matlab though, so I'd have to get the format the same. I use Matlab because I get the results I want. When I get to know Python + scipy etc. better I might remove that step.

Thanks again
Jim
Re: preallocate list
Jim wrote:
> Hi all
>
> Is this the best way to preallocate a list of integers?
> listName = range(0,length)

For serious numerical work you should use Numeric or Numarray, as others suggested. When I do allocate lists, the initial values 0:n-1 are rarely what I want, so I use

    ivec = n*[None]

so that if I use a list element before initializing it, for example

    ivec[0] += 1

I get an error message

    File "xxnone.py", line 2, in ?
      ivec[0] += 1
    TypeError: unsupported operand type(s) for +=: 'NoneType' and 'int'

This is in the same spirit as Python's (welcome) termination of a program when one tries to use an uninitialized scalar variable.
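The behaviour described above can be reproduced in a few lines. This is a small sketch in current Python 3; the variable `message` is only there to capture the error text for inspection.

```python
n = 3
ivec = n * [None]  # every slot starts out as None

try:
    ivec[0] += 1   # using a slot before assigning to it
except TypeError as exc:
    message = str(exc)  # the error names NoneType, as in the traceback above

ivec[0] = 0        # once the slot is initialized, the update works
ivec[0] += 1
```

Had the list been created with `range(n)` or zeros instead, the premature `+=` would have succeeded silently, which is the point of the `None` placeholder.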
Re: preallocate list
On Wed, 13 Apr 2005 16:46:53 +0100, Jim wrote:
>
> What I really want is a Numeric array but I don't think Numeric supports
> importing files.

Numeric arrays can be serialized from/to files through pickles:

    import Numeric as N
    help(N.load)
    help(N.dump)

(and it is space efficient)

>
> Jim
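The dump/load round trip can be illustrated with the standard-library `pickle` module. This is an analogy only, not Numeric's own API: the sketch pickles a plain list of lists, and `io.BytesIO` stands in for a file opened in binary mode.

```python
import io
import pickle

factors = [[1.0, 2.5], [3.0, 4.5]]

# BytesIO stands in for open(path, "wb") / open(path, "rb").
buffer = io.BytesIO()
pickle.dump(factors, buffer, protocol=pickle.HIGHEST_PROTOCOL)

buffer.seek(0)                  # rewind before reading back
restored = pickle.load(buffer)  # an equal but separate copy of the data
```

A pickle written this way is only readable from Python, so it would not by itself solve the problem of importing text files produced by another tool.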
Re: preallocate list
Jim wrote:
> What I really want is a Numeric array but I don't think Numeric supports
> importing files.

Hmmm... Maybe the scipy package? I think scipy.io.read_array might help, but I've never used it.

STeVe
Re: preallocate list
Steven Bethard wrote:
> Jim wrote:
>> ..
>> OK. I've just tried with 4 lines and the code works. With 11000 lines
>> it uses all CPU for at least 30 secs. There must be a better way.
>
> Was your test on *just* this function? Or were you doing something with
> the list produced by this function as well?
>
> STeVe

Well it's fast enough now. Thanks for having a look.

Jim
Re: preallocate list
[EMAIL PROTECTED] wrote:
> what about:
>
>     factors = [map(float, line.split()) for line in file]
>
> should be a hell of a lot faster and nicer.
>
>>     for line in f:
>>         factor = []
>>         tokens = line.split()
>>         for i in tokens:
>>             factor.append(float(i))
>>         factors.append(factor)
>>
>> Is this nasty?
>>
>> Jim

Oh the relief :) Of course, line.split() is already a list. Couple of seconds for the 1 line file. Thanks.

What I really want is a Numeric array but I don't think Numeric supports importing files.

Jim
Re: preallocate list
Steven Bethard wrote:
> Jim wrote:
>> Thanks for the suggestions. I guess I must ensure that this is my
>> bottleneck.
>>
>>     def readFactorsIntoList(self,filename,numberLoads):
>>         factors = []
>>         f = open(self.basedir + filename,'r')
>>         line = f.readline()
>>         tokens = line.split()
>>         columns = len(tokens)
>>         if int(columns) == number:
>>             for line in f:
>>                 factor = []
>>                 tokens = line.split()
>>                 for i in tokens:
>>                     factor.append(float(i))
>>                 factors.append(loadFactor)
>>         else:
>>             for line in f:
>>                 tokens = line.split()
>>                 factors.append([float(tokens[0])] * number)
>>         return factors
>>
>> OK. I've just tried with 4 lines and the code works. With 11000 lines
>> it uses all CPU for at least 30 secs. There must be a better way.
>
> Was your test on *just* this function? Or were you doing something with
> the list produced by this function as well?

Just this. I had a breakpoint on the return. I'm going to try peufeu's line of code and I'll report back.

Jim
Re: preallocate list
Jim wrote:
> Thanks for the suggestions. I guess I must ensure that this is my
> bottleneck.
>
>     def readFactorsIntoList(self,filename,numberLoads):
>         factors = []
>         f = open(self.basedir + filename,'r')
>         line = f.readline()
>         tokens = line.split()
>         columns = len(tokens)
>         if int(columns) == number:
>             for line in f:
>                 factor = []
>                 tokens = line.split()
>                 for i in tokens:
>                     factor.append(float(i))
>                 factors.append(loadFactor)
>         else:
>             for line in f:
>                 tokens = line.split()
>                 factors.append([float(tokens[0])] * number)
>         return factors
>
> OK. I've just tried with 4 lines and the code works. With 11000 lines
> it uses all CPU for at least 30 secs. There must be a better way.

Was your test on *just* this function? Or were you doing something with the list produced by this function as well?

STeVe
Re: preallocate list
what about:

    factors = [map(float, line.split()) for line in file]

should be a hell of a lot faster and nicer.

>     for line in f:
>         factor = []
>         tokens = line.split()
>         for i in tokens:
>             factor.append(float(i))
>         factors.append(factor)
>
> Is this nasty?
>
> Jim
Re: preallocate list
Jim wrote:
> Thanks for the suggestions. I guess I must ensure that this is my
> bottleneck.
...
>     for line in f:
>         factor = []
>         tokens = line.split()
>         for i in tokens:
>             factor.append(float(i))
>         factors.append(loadFactor)
>
...

You might try:

    factors = [ [float(item) for item in line.split()] for line in f ]

avoiding the extra statements for appending to the lists. Also might try:

    factors = [ map(float, line.split()) for line in f ]

though it uses the out-of-favour functional form for the mapping.

Good luck,
Mike

Mike C. Fletcher
Designer, VR Plumber, Coder
http://www.vrplumber.com
http://blog.vrplumber.com
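Both forms suggested above can be tried on a small in-memory sample. This is a sketch for current Python 3, where `map` returns an iterator and therefore needs wrapping in `list`. The function names are illustrative, and `io.StringIO` stands in for the open file `f`.

```python
import io


def read_rows(lines):
    # Nested list comprehension: one list of floats per input line.
    return [[float(item) for item in line.split()] for line in lines]


def read_rows_map(lines):
    # Functional form; list() is needed because map() is lazy in Python 3.
    return [list(map(float, line.split())) for line in lines]


sample = io.StringIO("1.0 2.0 3.0\n4.5 5.5 6.5\n")
rows = read_rows(sample)
```

Either function accepts any iterable of strings, so an open file object can be passed in directly.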
Re: preallocate list
Thanks for the suggestions. I guess I must ensure that this is my bottleneck.

    def readFactorsIntoList(self,filename,numberLoads):
        factors = []
        f = open(self.basedir + filename,'r')
        line = f.readline()
        tokens = line.split()
        columns = len(tokens)
        if int(columns) == number:
            for line in f:
                factor = []
                tokens = line.split()
                for i in tokens:
                    factor.append(float(i))
                factors.append(loadFactor)
        else:
            for line in f:
                tokens = line.split()
                factors.append([float(tokens[0])] * number)
        return factors

OK. I've just tried with 4 lines and the code works. With 11000 lines it uses all CPU for at least 30 secs. There must be a better way.

Jim
Re: preallocate list
Just a correction:

> I would profile your app to see that it's your append which is taking
> ages, but to preallocate a list of strings would look like:
>
>     ["This is an average length string" for i in range(approx_length)]
>
> My guess is that it won't help to preallocate, but time it and let us
> know. A test to back my guess:
>
>     import timeit, math
>
>     def test1():
>         lst = [0 for i in range(10)]
>         for i in xrange(10):
>             lst[i] = math.sin(i) * i
>
>     def test2():
>         lst = []
>         for i in xrange(10):
>             lst.append(math.sin(i) * i)
>
>     t1 = timeit.Timer('test1()', 'from __main__ import test1')
>     t2 = timeit.Timer('test2()', 'from __main__ import test2')
>     print "time1: %f" % t1.timeit(100)
>     print "time2: %f" % t2.timeit(100)
>

The results change slightly when I actually insert an integer, instead of a float, with lst[i] = i and lst.append(i):

    09:14 AM ~$ python test.py
    time1: 3.352000
    time2: 3.672000

The preallocated list is slightly faster in most of my tests, but I still don't think it'll bring a large performance benefit with it unless you're making a truly huge list.

I need to wake up before pressing "send".

Peace
Bill Mill
Re: preallocate list
rbt wrote:
> Jim wrote:
>> If I have a file with a floating point number on each line, what is
>> the best way of reading them into a list (or other ordered structure)?
>>
>> I was iterating with readline and appending to a list but it is
>> taking ages.
>
> Perhaps you should use readlines (notice the s) instead of readline.

I don't know if I thought of that, but I'm tokenizing each line before adding to a list of lists.

    for line in f:
        factor = []
        tokens = line.split()
        for i in tokens:
            factor.append(float(i))
        factors.append(factor)

Is this nasty?

Jim
Re: preallocate list
On 4/13/05, Jim <[EMAIL PROTECTED]> wrote:
> Hi all
>
> Is this the best way to preallocate a list of integers?
> listName = range(0,length)
>

the 0 is unnecessary; range(length) does the same thing.

> What about non integers?
>

    arr = [myobject() for i in range(length)]

> I've just claimed in the newsgroup above that pre-allocating helps but I
> might be getting confused with matlab ;)
>
> If I have a file with a floating point number on each line, what is the
> best way of reading them into a list (or other ordered structure)?
>
> I was iterating with readline and appending to a list but it is taking ages.
>

I would profile your app to see that it's your append which is taking ages, but to preallocate a list of strings would look like:

    ["This is an average length string" for i in range(approx_length)]

My guess is that it won't help to preallocate, but time it and let us know. A test to back my guess:

    import timeit, math

    def test1():
        lst = [0 for i in range(10)]
        for i in xrange(10):
            lst[i] = math.sin(i) * i

    def test2():
        lst = []
        for i in xrange(10):
            lst.append(math.sin(i) * i)

    t1 = timeit.Timer('test1()', 'from __main__ import test1')
    t2 = timeit.Timer('test2()', 'from __main__ import test2')
    print "time1: %f" % t1.timeit(100)
    print "time2: %f" % t2.timeit(100)

    09:09 AM ~$ python test.py
    time1: 12.435000
    time2: 12.385000

Peace
Bill Mill
bill.mill at gmail.com
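The advice to profile before optimizing can be followed with the standard-library `cProfile` module. Below is a minimal sketch in current Python 3; `read_floats`, `profile_read`, and the generated sample data are hypothetical stand-ins for the real reader and file, not code from the thread.

```python
import cProfile
import io
import pstats


def read_floats(lines):
    # Stand-in for the reader under suspicion: one float per line.
    values = []
    for line in lines:
        values.append(float(line))
    return values


def profile_read(lines):
    """Run read_floats under the profiler and return (result, report text)."""
    profiler = cProfile.Profile()
    profiler.enable()
    result = read_floats(lines)
    profiler.disable()

    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(5)  # five costliest entries
    return result, out.getvalue()


# 11000 lines, matching the file size mentioned in the thread.
sample = ["%d.5\n" % i for i in range(11000)]
values, report = profile_read(sample)
```

Printing `report` shows how the time is split between the loop itself and the calls to `append`, which is the question the timing test above tries to answer indirectly.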
Re: preallocate list
Jim wrote:
> If I have a file with a floating point number on each line, what is the
> best way of reading them into a list (or other ordered structure)?
>
> I was iterating with readline and appending to a list but it is taking
> ages.

Perhaps you should use readlines (notice the s) instead of readline.
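As a small illustration of the suggestion, `readlines` returns every line at once as a list of strings, while iterating over the file yields the same lines one at a time. The sketch below uses `io.StringIO` as a stand-in for an open text file and is written for current Python 3; the variable names are illustrative.

```python
import io

text = "0.5\n1.5\n2.5\n"

# readlines() loads every line into a list in one call.
all_lines = io.StringIO(text).readlines()
values_from_readlines = [float(line) for line in all_lines]

# Iterating over the file object yields the same lines one by one.
values_from_iteration = [float(line) for line in io.StringIO(text)]
```

`float` ignores the trailing newline, so no explicit `strip` is needed for one-number-per-line input.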