Re: time consuming loops over lists

Diez B. Roggisch Tue, 07 Jun 2005 09:15:38 -0700

[EMAIL PROTECTED] wrote:

Can someone help me improve this block of code? It just converts the
list of data into tokens based on the range each value falls into, but
it takes a long time. Can someone tell me what I can change to improve
it?

               if data[i] in xrange(rngs[j],rngs[j+1]):

That's a bummer: for every test you build a range and then search it linearly, where all you want to know is


if rngs[j] <= data[i] < rngs[j+1]
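
To make the difference concrete, here is a minimal sketch of the two tests side by side. It is written for modern Python 3, which is an assumption on my part (the script in this post is Python 2). In Python 3, `x in range(a, b)` on integers is already a constant-time check, so the linear search is emulated here with a generator. The helper names `in_range_scan` and `in_range_compare` are hypothetical.

```python
def in_range_scan(x, lo, hi):
    # Emulates the linear search: compare x against every integer in [lo, hi).
    return any(x == v for v in range(lo, hi))


def in_range_compare(x, lo, hi):
    # Two comparisons, independent of how wide the range is.
    return lo <= x < hi


# Both tests agree on integers, including the two boundaries.
for x in (19, 20, 21, 1249, 1250):
    assert in_range_scan(x, 20, 1250) == in_range_compare(x, 20, 1250)
```

The scan does work proportional to the width of the range, while the comparison does the same small amount of work for any width.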

Attached is a script that contains both your old version and my enhanced one, and shows that the results are equal. Running only your version takes ~35s, whereas mine takes ~1s!

Another optimization, which I'm too lazy to do right now, would be a sort of "tree search" for data[i] in rngs: as the ranges are ordered, you could find the proper one in log_2(len(rngs)) steps instead of len(rngs)/2.
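
One possible way to do that search is the standard library's `bisect` module, which performs a binary search on a sorted list. The sketch below is not part of the attached script. It assumes modern Python 3 and a plain list of bin edges instead of a Numeric array, and the name `tokenize_bisect` is hypothetical.

```python
from bisect import bisect_right
from math import ceil


def tokenize_bisect(tk, data, no_of_bins=10):
    # Same bin edges as the script in this post, built as a plain list.
    dmax = max(data) + 1
    dmin = min(data)
    width = ceil(abs((dmax - dmin) / no_of_bins))
    edges = [dmin + width * i for i in range(no_of_bins + 1)]

    tokens = []
    for x in data:
        # bisect_right returns the first index whose edge is greater than x,
        # so the bin j with edges[j] <= x < edges[j + 1] is one less.
        j = bisect_right(edges, x) - 1
        if 0 <= j < no_of_bins:
            tokens.append(str(tk) + str(j))
    return tokens
```

With only 10 bins the gain over the inner loop is modest; the binary search matters more as the number of bins grows.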

Additional improvements can be achieved when data is sorted, but that depends on your application and on the actual size of data.
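
As a rough illustration of the sorted case: if the data is already in ascending order, the bin index never has to move backwards, so one forward pass over the data and the edges is enough. This is a sketch under that assumption, written for modern Python 3. The name `tokenize_presorted` is hypothetical, and the tokens come out in sorted order, which may or may not suit your application.

```python
from math import ceil


def tokenize_presorted(tk, sorted_data, no_of_bins=10):
    # Assumes sorted_data is a non-empty list in ascending order.
    dmin = sorted_data[0]
    dmax = sorted_data[-1] + 1
    width = ceil(abs((dmax - dmin) / no_of_bins))
    edges = [dmin + width * i for i in range(no_of_bins + 1)]

    tokens = []
    j = 0  # current bin; only ever advances
    for x in sorted_data:
        # Step forward until x falls below the upper edge of bin j.
        # The last edge is always greater than the largest value,
        # so j + 1 stays within the edges list.
        while x >= edges[j + 1]:
            j += 1
        tokens.append(str(tk) + str(j))
    return tokens
```

If the data first has to be sorted just for this, the sort itself may cost more than it saves, so this only pays off when sorted data is available anyway.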


Diez

from math import *
from Numeric import *
from random import *
def Tkz2(tk,data):
     no_of_bins = 10
     tkns = []
     dmax = max(data)+1
     dmin = min(data)
     rng = ceil(abs((dmax - dmin)/(no_of_bins*1.0)))
     rngs = zeros(no_of_bins+1)
     for i in xrange(no_of_bins+1):
          rngs[i] = dmin + (rng*i)
     for i in xrange(len(data)):
          for j in xrange(len(rngs)-1):
               if rngs[j] <= data[i] < rngs[j+1]:
                    tkns.append( str(tk)+str(j) )
     return tkns

def Tkz(tk,data):
     no_of_bins = 10
     tkns = []
     dmax = max(data)+1
     dmin = min(data)
     rng = ceil(abs((dmax - dmin)/(no_of_bins*1.0)))
     rngs = zeros(no_of_bins+1)
     for i in xrange(no_of_bins+1):
          rngs[i] = dmin + (rng*i)
     for i in xrange(len(data)):
          for j in xrange(len(rngs)-1):
               if data[i] in xrange(rngs[j], rngs[j+1]):
                    tkns.append( str(tk)+str(j) )
     return tkns


data = range(20,12312)
shuffle(data)
res1 = Tkz('A', data)
res2 = Tkz2('A', data)

print res1 == res2

-- 
http://mail.python.org/mailman/listinfo/python-list
