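from numpy import asmatrix, fromiter

def list2index(L):
    # Map each unique element to an integer, then look up every element of L
    idx = dict((y, x) for x, y in enumerate(set(L)))
    return asmatrix(fromiter((idx[x] for x in L), dtype=int))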
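
To make the idea concrete, here is a minimal sketch of the same dictionary lookup written for Python 3 and a current NumPy. The parameter name `values`, the sample list, and the choice to return a plain 1-D array instead of a matrix are assumptions made for illustration only. Because a set has no defined order, which element receives which integer is arbitrary; only the consistency of the mapping is guaranteed.

```python
import numpy as np

def list2index(values):
    # Build the mapping element -> integer code (0 .. k-1) once.
    # Set iteration order is arbitrary, so the particular code given
    # to each element is not fixed.
    codes = {value: code for code, value in enumerate(set(values))}
    # One dictionary lookup per element, so the work grows linearly
    # with len(values) instead of once per unique value.
    return np.fromiter((codes[v] for v in values), dtype=int, count=len(values))

sample = ["b", "a", "b", "c", "a"]   # hypothetical input
idx = list2index(sample)
print(idx)   # equal elements share a code; the exact codes may vary
```

The elements only need to be hashable, which is why numbers, strings and datetime.date objects can all go through the same code path.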
# old
$ python test.py
Numbers: 29.4062280655 seconds
Characters: 84.6239070892 seconds
Dates: 117.560418844 seconds
# new
$ python test.py
Numbers: 1.79700994492 seconds
Characters: 1.6025249958 seconds
Dates: 1.7974088192 seconds
Roughly 16, 53 and 65 times faster
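
As an aside, and assuming a current NumPy under Python 3, `np.unique` with `return_inverse=True` yields a similar mapping without an explicit dictionary. This is a different technique from the function above: the codes follow the sorted order of the unique values, so the elements must be mutually comparable. The function name `list2index_sorted` is hypothetical, and I have not timed this against the dictionary version.

```python
import datetime
import numpy as np

def list2index_sorted(values):
    # np.unique returns the sorted distinct values; with
    # return_inverse=True it also returns, for every input element,
    # its position in that sorted array.
    uniques, inverse = np.unique(np.asarray(values), return_inverse=True)
    return uniques, inverse

uniques, codes = list2index_sorted(["b", "a", "b", "c", "a"])
print(uniques)   # ['a' 'b' 'c']
print(codes)     # [1 0 1 2 0]

# datetime.date objects end up in an object array and are sorted
# with their normal comparison operators.
d = datetime.date
date_uniques, date_codes = list2index_sorted(
    [d(2006, 1, 2), d(2006, 1, 1), d(2006, 1, 2)]
)
```

Unlike the set-based version, the result is reproducible from run to run, and `uniques[codes]` recovers the original values.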
//Torgil
On 8/29/06, Keith Goodman <[EMAIL PROTECTED]> wrote:
> I have a very long list that contains many repeated elements. The
> elements of the list can be either all numbers, or all strings, or all
> dates [datetime.date].
>
> I want to convert the list into a matrix where each unique element of
> the list is assigned a consecutive integer starting from zero.
>
> I've done it by brute force below. Any tips for making it faster? (5x
> would make it useful; 10x would be a dream.)
>
> >> list2index.test()
> Numbers: 5.84955787659 seconds
> Characters: 24.3192870617 seconds
> Dates: 39.288228035 seconds
>
>
> import datetime, time
> from numpy import nan, asmatrix, ones
>
> def list2index(L):
>
>     # Find unique elements in list
>     uL = dict.fromkeys(L).keys()
>
>     # Convert list to matrix
>     L = asmatrix(L).T
>
>     # Initialize return matrix
>     idx = nan * ones((L.size, 1))
>
>     # Assign numbers to unique L values
>     for i, uLi in enumerate(uL):
>         idx[L == uLi,:] = i
>
>     return idx
>
> def test():
>
>     L = 5000*range(255)
>     t1 = time.time()
>     idx = list2index(L)
>     t2 = time.time()
>     print 'Numbers:', t2-t1, 'seconds'
>
>     L = 5000*[chr(z) for z in range(255)]
>     t1 = time.time()
>     idx = list2index(L)
>     t2 = time.time()
>     print 'Characters:', t2-t1, 'seconds'
>
>     d = datetime.date
>     step = datetime.timedelta
>     L = 5000*[d(2006,1,1)+step(z) for z in range(255)]
>     t1 = time.time()
>     idx = list2index(L)
>     t2 = time.time()
>     print 'Dates:', t2-t1, 'seconds'
>
> -------------------------------------------------------------------------
> Using Tomcat but need to do more? Need to support web services, security?
> Get stuff done quickly with pre-integrated technology to make your job easier
> Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
> _______________________________________________
> Numpy-discussion mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/numpy-discussion
>