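from numpy import asmatrix, fromiter

def list2index(L):
    # Map each unique element to an integer (order follows set iteration)
    idx = dict((y, x) for x, y in enumerate(set(L)))
    # Look up every element and pack the codes into a 1 x len(L) matrix
    return asmatrix(fromiter((idx[x] for x in L), dtype=int))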
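One thing to keep in mind with the function above: the integer given to each value follows set iteration order, which is arbitrary. As a rough sketch only, the variant below uses the same dictionary-lookup idea but numbers values in order of first appearance, so the codes are reproducible. The name `list2index_stable` is hypothetical and not from the original post; the sketch assumes Python 3 and returns a plain 1-D array rather than a matrix.

```python
import numpy as np

def list2index_stable(L):
    """Map each element of L to an integer code, numbered by first appearance."""
    codes = {}
    # setdefault stores len(codes) only the first time a value is seen,
    # so unique values receive 0, 1, 2, ... in order of appearance.
    return np.fromiter((codes.setdefault(x, len(codes)) for x in L),
                       dtype=int, count=len(L))

if __name__ == "__main__":
    print(list2index_stable(["b", "a", "b", "c", "a"]))
```

Wrapping the result in `asmatrix(...)` would give the same matrix type as the function above. If sorted codes are acceptable and the values are sortable, `numpy.unique(L, return_inverse=True)` in current NumPy is another option worth timing.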
# old
$ python test.py
Numbers: 29.4062280655 seconds
Characters: 84.6239070892 seconds
Dates: 117.560418844 seconds

# new
$ python test.py
Numbers: 1.79700994492 seconds
Characters: 1.6025249958 seconds
Dates: 1.7974088192 seconds

Roughly 16, 53 and 65 times faster.

//Torgil

On 8/29/06, Keith Goodman <[EMAIL PROTECTED]> wrote:
> I have a very long list that contains many repeated elements. The
> elements of the list can be either all numbers, or all strings, or all
> dates [datetime.date].
>
> I want to convert the list into a matrix where each unique element of
> the list is assigned a consecutive integer starting from zero.
>
> I've done it by brute force below. Any tips for making it faster? (5x
> would make it useful; 10x would be a dream.)
>
> >> list2index.test()
> Numbers: 5.84955787659 seconds
> Characters: 24.3192870617 seconds
> Dates: 39.288228035 seconds
>
>
> import datetime, time
> from numpy import nan, asmatrix, ones
>
> def list2index(L):
>
>     # Find unique elements in list
>     uL = dict.fromkeys(L).keys()
>
>     # Convert list to matrix
>     L = asmatrix(L).T
>
>     # Initialize return matrix
>     idx = nan * ones((L.size, 1))
>
>     # Assign numbers to unique L values
>     for i, uLi in enumerate(uL):
>         idx[L == uLi,:] = i
>
> def test():
>
>     L = 5000*range(255)
>     t1 = time.time()
>     idx = list2index(L)
>     t2 = time.time()
>     print 'Numbers:', t2-t1, 'seconds'
>
>     L = 5000*[chr(z) for z in range(255)]
>     t1 = time.time()
>     idx = list2index(L)
>     t2 = time.time()
>     print 'Characters:', t2-t1, 'seconds'
>
>     d = datetime.date
>     step = datetime.timedelta
>     L = 5000*[d(2006,1,1)+step(z) for z in range(255)]
>     t1 = time.time()
>     idx = list2index(L)
>     t2 = time.time()
>     print 'Dates:', t2-t1, 'seconds'

_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/numpy-discussion