Re: [Tutor] A faster x in S

2008-01-16 Thread Dinesh B Vadhia
I used the s.intersection(t) function in the set type as it was the most 
appropriate.  The performance was phenomenal.  Thank-you!

Dinesh


- Original Message - 
From: bob gailer 
To: Dinesh B Vadhia 
Cc: tutor@python.org 
Sent: Tuesday, January 15, 2008 2:03 PM
Subject: Re: [Tutor] A faster x in S


Dinesh B Vadhia wrote:
 For some significant data pre-processing we have to perform the 
 following simple process:
  
 Is the integer x in a list of 13K sorted integers.  That's it except 
 this has to be done 100m times with different x's (multiple times).  
 Yep, a real pain! 
  
 I've put the 13K integers in a list S and am using the is 'x in S' 
 function.
  
 I was wondering if there is anything faster?
I agree with Kent.

  l = range(13000)
  s=set(l)
  d=dict(enumerate(l))
  import time
  def f(lookupVal, times, values):
.. st=time.time()
.. for i in range(times):
.. z = lookupVal in values
.. return time.time()-st   
  f(6499,1000,l)
0.3126376037598
  f(6499,100,s)
0.3123623962402

So set is 1000 times faster than list!

  f(6499,100,d)
0.31300020217895508

And dict is (as expected) about the same as set.

So 100,000,000 lookups should take about 30 seconds. Not bad, eh?

Let's explore another angle. What range are the integers in (min and max)?

Bob
___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


[Tutor] A faster x in S

2008-01-15 Thread Dinesh B Vadhia
For some significant data pre-processing we have to perform the following 
simple process:

Is the integer x in a list of 13K sorted integers.  That's it except this has 
to be done 100m times with different x's (multiple times).  Yep, a real pain!  

I've put the 13K integers in a list S and am using the is 'x in S' function.

I was wondering if there is anything faster?

Dinesh
___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] A faster x in S

2008-01-15 Thread Kent Johnson
Dinesh B Vadhia wrote:
 For some significant data pre-processing we have to perform the 
 following simple process:
  
 Is the integer x in a list of 13K sorted integers.  That's it except 
 this has to be done 100m times with different x's (multiple times).  
 Yep, a real pain! 
  
 I've put the 13K integers in a list S and am using the is 'x in S' function.
  
 I was wondering if there is anything faster?

Yes. Put the integers in a set and test for membership there. If for 
some reason the integers have to be in a list, and the list is sorted, 
use the bisect module to do a binary search, rather than the linear 
search used by 'x in S'.
http://docs.python.org/lib/module-bisect.html

Kent
___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] A faster x in S

2008-01-15 Thread bob gailer
Dinesh B Vadhia wrote:
 For some significant data pre-processing we have to perform the 
 following simple process:
  
 Is the integer x in a list of 13K sorted integers.  That's it except 
 this has to be done 100m times with different x's (multiple times).  
 Yep, a real pain! 
  
 I've put the 13K integers in a list S and am using the is 'x in S' 
 function.
  
 I was wondering if there is anything faster?
I agree with Kent.

  l = range(13000)
  s=set(l)
  d=dict(enumerate(l))
  import time
  def f(lookupVal, times, values):
... st=time.time()
... for i in range(times):
... z = lookupVal in values
... return time.time()-st   
  f(6499,1000,l)
0.3126376037598
  f(6499,100,s)
0.3123623962402

So set is 1000 times faster than list!

  f(6499,100,d)
0.31300020217895508

And dict is (as expected) about the same as set.

So 100,000,000 lookups should take about 30 seconds. Not bad, eh?

Let's explore another angle. What range are the integers in (min and max)?

Bob
___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


[Tutor] A faster x in S

2008-01-15 Thread Danny Yoo
Kent and Bob,

Are you thinking of the first problem in Bentley's Programming Pearls? 
The original poster's questions sounds like it could be in that domain.

 http://netlib.bell-labs.com/cm/cs/pearls/cto.html

So I agree: the next questions we probably should ask the original poster:

* Why are you trying to search for a number in those sorted integers?


* Is there anything characteristic about those sorted integers that
  might be peculiar or useful?  Do the numbers have streaks?  Are the
  integers large or small?
___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor