Re: preallocate list

2005-04-14 Thread Jim
John Machin wrote:
> On Wed, 13 Apr 2005 14:28:51 +0100, Jim [EMAIL PROTECTED]
> wrote:
>
>> Thanks for the suggestions. I guess I must ensure that this is my
>> bottleneck.
>>
>> <code>
>>     def readFactorsIntoList(self,filename,numberLoads):
>
> 1. numberLoads is not used.
>
>>         factors = []
>>         f = open(self.basedir + filename,'r')
>>         line = f.readline()
>>         tokens = line.split()
>>         columns = len(tokens)
>>         if int(columns) == number:
>
> 2. columns is already an int (unless of course you've redefined
> len!). Doing int(columns) is pointless.
> 3. What is number? Same as numberLoads?
> 4. Please explain in general what is the layout of your file and in
> particular, what is the significance of the first line of the file and
> of the above if test.
>
>>             for line in f:
>>                 factor = []
>>                 tokens = line.split()
>>                 for i in tokens:
>>                     factor.append(float(i))
>
> 4. factor is built and then not used any more??
>
>>                 factors.append(loadFactor)
>
> 5. What is loadFactor? Same as factor?
>
>>         else:
>>             for line  in f:
>>                 tokens = line.split()
>>                 factors.append([float(tokens[0])] * number)
>
> 6. You throw away any tokens in the line after the first??
>
>>         return factors
>> </code>
>>
>> OK. I've just tried with 4 lines and the code works.
>
> Which code works? The code you posted? Please define "works".
>
>> With 11000 lines it
>> uses all CPU for at least 30 secs. There must be a better way.
>
> Perhaps after you post the code that you've actually run, and
> explained what your file layout is, and what you are trying to
> achieve, then we can give you some meaningful help.
>
> Cheers,
> John

Thanks for looking John. For that I should take a little time to explain.
I tried to rename the variables, some of them were four words long. I 
got a couple of the renames wrong. Sorry.

Regarding 'works'. I meant that with a text file of four lines the code 
completed. With my desired size 11000 lines it didn't complete within 
the limits of my patience. I didn't try any other size.

Also I perhaps wrongly use the newsgroup threads paradigm in trying to 
restart my query with extra information (that turned out a little faulty).

Luckily the other branches yielded fruit.
Thanks again
Jim
--
http://mail.python.org/mailman/listinfo/python-list


preallocate list

2005-04-13 Thread Jim
Hi all
Is this the best way to preallocate a list of integers?
listName = range(0,length)
What about non integers?
I've just claimed in the newsgroup above that pre-allocating helps but I 
might be getting confused with matlab ;)

If I have a file with a floating point number on each line, what is the 
best way of reading them into a list (or other ordered structure)?

I was iterating with readline and appending to a list but it is taking ages.
Jim
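
A minimal sketch of the question above (one floating point number per line, read into a list), assuming Python 3 rather than the Python 2.4 used in this thread. The helper name read_floats and the sample data are hypothetical, and io.StringIO only stands in for an open file:

```python
import io

def read_floats(lines):
    """Return a list of floats, one per non-blank line."""
    # float() ignores surrounding whitespace, including the newline.
    return [float(line) for line in lines if line.strip()]

# io.StringIO stands in for an open text file here.
sample = io.StringIO("1.5\n2.25\n\n-3.0\n")
values = read_floats(sample)
print(values)
```

Iterating over the file object directly avoids calling readline in a loop, and the list grows by appending internally, so no preallocation is involved.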


Re: preallocate list

2005-04-13 Thread rbt
Jim wrote:
> If I have a file with a floating point number on each line, what is the
> best way of reading them into a list (or other ordered structure)?
>
> I was iterating with readline and appending to a list but it is taking
> ages.

Perhaps you should use readlines (notice the s) instead of readline.


Re: preallocate list

2005-04-13 Thread Bill Mill
On 4/13/05, Jim [EMAIL PROTECTED] wrote:
> Hi all
>
> Is this the best way to preallocate a list of integers?
> listName = range(0,length)

The 0 is unnecessary; range(length) does the same thing.

> What about non integers?

arr = [myobject() for i in range(length)]

> I've just claimed in the newsgroup above that pre-allocating helps but I
> might be getting confused with matlab ;)
>
> If I have a file with a floating point number on each line, what is the
> best way of reading them into a list (or other ordered structure)?
>
> I was iterating with readline and appending to a list but it is taking ages.

I would profile your app to see that it's your append which is taking
ages, but to preallocate a list of strings would look like:

["This is an average length string" for i in range(approx_length)]

My guess is that it won't help to preallocate, but time it and let us
know. A test to back my guess:

import timeit, math

def test1():
    lst = [0 for i in range(10)]
    for i in xrange(10):
        lst[i] = math.sin(i) * i

def test2():
    lst = []
    for i in xrange(10):
        lst.append(math.sin(i) * i)

t1 = timeit.Timer('test1()', 'from __main__ import test1')
t2 = timeit.Timer('test2()', 'from __main__ import test2')
print "time1: %f" % t1.timeit(100)
print "time2: %f" % t2.timeit(100)

09:09 AM ~$ python test.py
time1: 12.435000
time2: 12.385000

Peace
Bill Mill
bill.mill at gmail.com


Re: preallocate list

2005-04-13 Thread Jim
rbt wrote:
> Jim wrote:
>> If I have a file with a floating point number on each line, what is
>> the best way of reading them into a list (or other ordered structure)?
>>
>> I was iterating with readline and appending to a list but it is taking
>> ages.
>
> Perhaps you should use readlines (notice the s) instead of readline.

I don't know if I thought of that, but I'm tokenizing each line before
adding to a list of lists.

for line in f:
    factor = []
    tokens = line.split()
    for i in tokens:
        factor.append(float(i))
    factors.append(factor)

Is this nasty?
Jim


Re: preallocate list

2005-04-13 Thread Bill Mill
Just a correction:

<snip>
> I would profile your app to see that it's your append which is taking
> ages, but to preallocate a list of strings would look like:
>
> ["This is an average length string" for i in range(approx_length)]
>
> My guess is that it won't help to preallocate, but time it and let us
> know. A test to back my guess:
>
> import timeit, math
>
> def test1():
>     lst = [0 for i in range(10)]
>     for i in xrange(10):
>         lst[i] = math.sin(i) * i
>
> def test2():
>     lst = []
>     for i in xrange(10):
>         lst.append(math.sin(i) * i)
>
> t1 = timeit.Timer('test1()', 'from __main__ import test1')
> t2 = timeit.Timer('test2()', 'from __main__ import test2')
> print "time1: %f" % t1.timeit(100)
> print "time2: %f" % t2.timeit(100)

The results change slightly when I actually insert an integer, instead
of a float, with lst[i] = i and lst.append(i):

09:14 AM ~$ python test.py
time1: 3.352000
time2: 3.672000

The preallocated list is slightly faster in most of my tests, but I
still don't think it'll bring a large performance benefit with it
unless you're making a truly huge list.

I need to wake up before pressing send.

Peace
Bill Mill


Re: preallocate list

2005-04-13 Thread Jim
Thanks for the suggestions. I guess I must ensure that this is my
bottleneck.

<code>
def readFactorsIntoList(self,filename,numberLoads):
    factors = []
    f = open(self.basedir + filename,'r')
    line = f.readline()
    tokens = line.split()
    columns = len(tokens)
    if int(columns) == number:
        for line in f:
            factor = []
            tokens = line.split()
            for i in tokens:
                factor.append(float(i))
            factors.append(loadFactor)
    else:
        for line  in f:
            tokens = line.split()
            factors.append([float(tokens[0])] * number)
    return factors
</code>

OK. I've just tried with 4 lines and the code works. With 11000 lines it 
uses all CPU for at least 30 secs. There must be a better way.

Jim


Re: preallocate list

2005-04-13 Thread Mike C. Fletcher
Jim wrote:

> Thanks for the suggestions. I guess I must ensure that this is my
> bottleneck.

...

> for line in f:
>     factor = []
>     tokens = line.split()
>     for i in tokens:
>         factor.append(float(i))
>     factors.append(loadFactor)

...

You might try:

factors = [ [float(item) for item in line.split()] for line in f ]

avoiding the extra statements for appending to the lists.  Also might try:

factors = [ map(float, line.split()) for line in f ]

though it uses the out-of-favour functional form for the mapping.

Good luck,
Mike


  Mike C. Fletcher
  Designer, VR Plumber, Coder
  http://www.vrplumber.com
  http://blog.vrplumber.com
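
A rough sketch of the two one-liners suggested above, assuming Python 3 instead of the Python 2.4 of this thread. The function names and sample text are hypothetical, and io.StringIO stands in for the open file f. One difference worth noting: in Python 3, map() returns an iterator, so it has to be wrapped in list() to get a list of lists.

```python
import io

def read_rows(lines):
    """Parse whitespace-separated floats, one row per line."""
    return [[float(tok) for tok in line.split()] for line in lines]

def read_rows_map(lines):
    """Same result using map(); list() is needed in Python 3."""
    return [list(map(float, line.split())) for line in lines]

text = "1.0 2.0 3.0\n4.5 5.5 6.5\n"
rows = read_rows(io.StringIO(text))
rows_map = read_rows_map(io.StringIO(text))
print(rows)
print(rows == rows_map)
```

Both forms build each row in a single expression, which removes the explicit append calls from the inner loop.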



Re: preallocate list

2005-04-13 Thread peufeu
what about:

factors = [map(float, line.split()) for line in file]

should be a hell of a lot faster and nicer.

> for line in f:
>     factor = []
>     tokens = line.split()
>     for i in tokens:
>         factor.append(float(i))
>     factors.append(factor)
>
> Is this nasty?
> Jim


Re: preallocate list

2005-04-13 Thread Steven Bethard
Jim wrote:
> Thanks for the suggestions. I guess I must ensure that this is my
> bottleneck.
>
> <code>
> def readFactorsIntoList(self,filename,numberLoads):
>     factors = []
>     f = open(self.basedir + filename,'r')
>     line = f.readline()
>     tokens = line.split()
>     columns = len(tokens)
>     if int(columns) == number:
>         for line in f:
>             factor = []
>             tokens = line.split()
>             for i in tokens:
>                 factor.append(float(i))
>             factors.append(loadFactor)
>     else:
>         for line  in f:
>             tokens = line.split()
>             factors.append([float(tokens[0])] * number)
>     return factors
> </code>
>
> OK. I've just tried with 4 lines and the code works. With 11000 lines it
> uses all CPU for at least 30 secs. There must be a better way.

Was your test on *just* this function?  Or were you doing something with
the list produced by this function as well?

STeVe


Re: preallocate list

2005-04-13 Thread Jim
Steven Bethard wrote:
> Jim wrote:
>> Thanks for the suggestions. I guess I must ensure that this is my
>> bottleneck.
>>
>> <code>
>> def readFactorsIntoList(self,filename,numberLoads):
>>     factors = []
>>     f = open(self.basedir + filename,'r')
>>     line = f.readline()
>>     tokens = line.split()
>>     columns = len(tokens)
>>     if int(columns) == number:
>>         for line in f:
>>             factor = []
>>             tokens = line.split()
>>             for i in tokens:
>>                 factor.append(float(i))
>>             factors.append(loadFactor)
>>     else:
>>         for line  in f:
>>             tokens = line.split()
>>             factors.append([float(tokens[0])] * number)
>>     return factors
>> </code>
>>
>> OK. I've just tried with 4 lines and the code works. With 11000 lines
>> it uses all CPU for at least 30 secs. There must be a better way.
>
> Was your test on *just* this function?  Or were you doing something with
> the list produced by this function as well?

Just this. I had a breakpoint on the return.
I'm going to try peufeu's line of code and I'll report back.
Jim


Re: preallocate list

2005-04-13 Thread Jim
[EMAIL PROTECTED] wrote:
> what about:
>
> factors = [map(float, line.split()) for line in file]
>
> should be a hell of a lot faster and nicer.
>
>> for line in f:
>>     factor = []
>>     tokens = line.split()
>>     for i in tokens:
>>         factor.append(float(i))
>>     factors.append(factor)
>>
>> Is this nasty?
>> Jim

Oh the relief :)
Of course, line.split() is already a list.
Couple of seconds for the 1 line file.
Thanks.
What I really want is a Numeric array but I don't think Numeric supports 
importing files.

Jim


Re: preallocate list

2005-04-13 Thread Jim
Steven Bethard wrote:
> Jim wrote:
> ..
>> OK. I've just tried with 4 lines and the code works. With 11000 lines
>> it uses all CPU for at least 30 secs. There must be a better way.
>
> Was your test on *just* this function?  Or were you doing something with
> the list produced by this function as well?
>
> STeVe

Well it's fast enough now. Thanks for having a look.
Jim


Re: preallocate list

2005-04-13 Thread Steven Bethard
Jim wrote:
> What I really want is a Numeric array but I don't think Numeric supports
> importing files.

Hmmm...  Maybe the scipy package?

I think scipy.io.read_array might help, but I've never used it.
STeVe


Re: preallocate list

2005-04-13 Thread F. Petitjean
On Wed, 13 Apr 2005 16:46:53 +0100, Jim wrote:
>
> What I really want is a Numeric array but I don't think Numeric supports
> importing files.

Numeric arrays can be serialized from/to files through pickles:

import Numeric as N
help(N.load)
help(N.dump)

(and it is space efficient)

> Jim


Re: preallocate list

2005-04-13 Thread beliavsky
Jim wrote:
> Hi all
>
> Is this the best way to preallocate a list of integers?
> listName = range(0,length)

For serious numerical work you should use Numeric or Numarray, as
others suggested. When I do allocate lists the initial values 0:n-1 are
rarely what I want, so I use

ivec = n*[None]

so that if I use a list element before initializing it, for example

ivec[0] += 1

I get an error message

  File "xxnone.py", line 2, in ?
    ivec[0] += 1
TypeError: unsupported operand type(s) for +=: 'NoneType' and 'int'

This is in the same spirit as Python's (welcome) termination of a
program when one tries to use an uninitialized scalar variable.
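
The None-filled preallocation described above can be sketched as follows, assuming Python 3 (the exact wording of the TypeError message may differ between versions). The length n = 5 is an arbitrary choice for illustration:

```python
n = 5
ivec = [None] * n  # n slots, each referring to the single None object

try:
    ivec[0] += 1   # element used before it was assigned a number
except TypeError as exc:
    error = str(exc)
    print("TypeError:", error)

# After a proper assignment the same operation succeeds.
ivec[0] = 0
ivec[0] += 1
print(ivec)
```

Had the list been created with range(n) or n*[0], the premature += would have silently produced a number instead of raising.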



Re: preallocate list

2005-04-13 Thread Jim
F. Petitjean wrote:
> On Wed, 13 Apr 2005 16:46:53 +0100, Jim wrote:
>> What I really want is a Numeric array but I don't think Numeric supports
>> importing files.
>
> Numeric arrays can be serialized from/to files through pickles:
>
> import Numeric as N
> help(N.load)
> help(N.dump)
>
> (and it is space efficient)
>
>> Jim

Yeah thanks. I'm generating them using Matlab though so I'd have to get
the format the same. I use Matlab because I get the results I want. When
I get to know Python + scipy etc. better I might remove that step.

Thanks again
Jim


Re: preallocate list

2005-04-13 Thread Jim
Steven Bethard wrote:
> Jim wrote:
>> What I really want is a Numeric array but I don't think Numeric
>> supports importing files.
>
> Hmmm...  Maybe the scipy package?
>
> I think scipy.io.read_array might help, but I've never used it.
>
> STeVe

Sounds promising.
I only got Numeric because I wanted scipy but I've hardly explored it as
I kept running into problems even with the complicated examples cut and
pasted into a file ;)

Oh yeah, I wanted to explore the GA module but no docs :( and I got busy 
doing other stuff.

Thanks
Jim


Re: preallocate list

2005-04-13 Thread Jim

> ivec = n*[None]
>
> so that if I use a list element before initializing it, for example
>
> ivec[0] += 1
>
> I get an error message
>
>   File "xxnone.py", line 2, in ?
>     ivec[0] += 1
> TypeError: unsupported operand type(s) for +=: 'NoneType' and 'int'
>
> This is in the same spirit as Python's (welcome) termination of a
> program when one tries to use an uninitialized scalar variable.

I feel foolish that I forgot about *. I've just started with Python then
took 2 weeks off. I'll explore pre-allocation when I'm back up to speed.

Yep, I use None a lot.
Thanks
Jim


Re: preallocate list

2005-04-13 Thread Dan Christensen
Bill Mill [EMAIL PROTECTED] writes:

> Bill Mill [EMAIL PROTECTED] writes:
>
>> I would profile your app to see that it's your append which is taking
>> ages, but to preallocate a list of strings would look like:
>>
>> ["This is an average length string" for i in range(approx_length)]

I don't think there's any point putting strings into the preallocated
list.  A list is just an array of pointers to objects, so any object
will do fine for preallocation, no matter what the list will be used for.

>> My guess is that it won't help to preallocate, but time it and let us
>> know. A test to back my guess:
>>
>> import timeit, math
>>
>> def test1():
>>     lst = [0 for i in range(10)]
>>     for i in xrange(10):
>>         lst[i] = math.sin(i) * i
>>
>> def test2():
>>     lst = []
>>     for i in xrange(10):
>>         lst.append(math.sin(i) * i)

...

> The results change slightly when I actually insert an integer, instead
> of a float, with lst[i] = i and lst.append(i):
>
> 09:14 AM ~$ python test.py
> time1: 3.352000
> time2: 3.672000

If you use

  lst = range(10)

or even better

  lst = [None]*10

then test1 is more than twice as fast as test2:

time1: 2.437730
time2: 5.308054

(using python 2.4).

Your code

> lst = [0 for i in range(10)]

made python do an extra 10-iteration loop.

Dan
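
For anyone wanting to repeat the comparison above, here is a sketch of the same test under Python 3, where xrange no longer exists and print is a function. The list length N and the repeat count are arbitrary assumptions, and the function names are hypothetical. No timings are shown because they depend on the machine and interpreter version.

```python
import math
import timeit

N = 10000  # hypothetical list length, chosen only for illustration

def prealloc():
    """Fill a list that was preallocated with [None] * N."""
    lst = [None] * N
    for i in range(N):
        lst[i] = math.sin(i) * i
    return lst

def append():
    """Grow an empty list one element at a time."""
    lst = []
    for i in range(N):
        lst.append(math.sin(i) * i)
    return lst

if __name__ == "__main__":
    # timeit.timeit accepts a callable directly, so no setup string is needed.
    for func in (prealloc, append):
        seconds = timeit.timeit(func, number=20)
        print("%s: %f" % (func.__name__, seconds))
```

Using [None] * N keeps the allocation itself out of the interpreted loop, which is the point made about the extra list-comprehension loop in the original test1.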


Re: preallocate list

2005-04-13 Thread John Machin
On Wed, 13 Apr 2005 14:28:51 +0100, Jim [EMAIL PROTECTED]
wrote:

> Thanks for the suggestions. I guess I must ensure that this is my
> bottleneck.
>
> <code>
>     def readFactorsIntoList(self,filename,numberLoads):

1. numberLoads is not used.

>         factors = []
>         f = open(self.basedir + filename,'r')
>         line = f.readline()
>         tokens = line.split()
>         columns = len(tokens)
>         if int(columns) == number:

2. columns is already an int (unless of course you've redefined
len!). Doing int(columns) is pointless.
3. What is number? Same as numberLoads?
4. Please explain in general what is the layout of your file and in
particular, what is the significance of the first line of the file and
of the above if test.

>             for line in f:
>                 factor = []
>                 tokens = line.split()
>                 for i in tokens:
>                     factor.append(float(i))

4. factor is built and then not used any more??

>                 factors.append(loadFactor)

5. What is loadFactor? Same as factor?

>         else:
>             for line  in f:
>                 tokens = line.split()
>                 factors.append([float(tokens[0])] * number)

6. You throw away any tokens in the line after the first??

>         return factors
> </code>
>
> OK. I've just tried with 4 lines and the code works.

Which code works? The code you posted? Please define "works".

> With 11000 lines it
> uses all CPU for at least 30 secs. There must be a better way.

Perhaps after you post the code that you've actually run, and
explained what your file layout is, and what you are trying to
achieve, then we can give you some meaningful help.

Cheers,

John
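
One possible reading of the posted function with the points above addressed, written for Python 3. This is only a sketch: the file layout was never explained in the thread, so it assumes that number and numberLoads are the same value, that loadFactor was meant to be factor, and that the row width is checked per line (the posted code checks only the first line and does not store it). The name read_factors and the sample data are hypothetical.

```python
import io

def read_factors(lines, number_loads):
    """Return one list of number_loads floats per non-blank input line."""
    factors = []
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        if len(tokens) == number_loads:
            # Full row: convert every token.
            factors.append([float(tok) for tok in tokens])
        else:
            # Assumption: a short row means "repeat the first value".
            factors.append([float(tokens[0])] * number_loads)
    return factors

# io.StringIO stands in for the open file.
sample = io.StringIO("1.0 2.0 3.0\n4.0\n")
print(read_factors(sample, 3))
```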


