Re: Which is faster? (if not b in m) or (if m.count(b) > 0)

2006-02-14 Thread Marc 'BlackJack' Rintsch
In <[EMAIL PROTECTED]>, Farel wrote:

> Which is Faster in Python and Why?

``if not b in m`` looks at each element of `m` until it finds `b` in it
and stops then.  Assuming `b` is in `m`, otherwise all elements of `m` are
"touched".

``if m.count(b) > 0`` will always goes through all elements of `m` in the
`count()` method.

Ciao,
Marc 'BlackJack' Rintsch
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Which is faster? (if not b in m) or (if m.count(b) > 0)

2006-02-15 Thread Felipe Almeida Lessa
Em Ter, 2006-02-14 às 20:14 -0800, Farel escreveu:
> Which is Faster in Python and Why?
> 
> jc = {}; m = []
> x = [ [1,2,3,4,5,6,7,8,9],[..],...] # upwards of 1 entries
> def binm()
> for item in x:
> b = item[:]; b.sort(); bc = 0
> for bitem in b: bc += int(bitem)
> try: m = jc[bc]
> except: m = []
> if not b in m:
> m.append(b); jc[bc]=m

Why do you copy the list and sort it if you're just summing its
elements? Instead of

b = item[:]; b.sort(); bc = 0
for bitem in b: bc += int(bitem)

you can do just

bc = sum(int(i) for i in item)

or, if the order *really* matters, what is very strange, you can do

bc = sum(int(i) for i in sorted(item))

Another thing, if you want to have in the values of the dictionary jc
only unique elements, you can use sets:

>>> a = set()
>>> print a
set([])
>>> a.add(10)
>>> print a
set([10])
>>> a.add(10)
>>> print a
set([10])
>>> a.add(10)
>>> print a
set([10])

So, the final example would be (assuming that you don't need to sum in
order):

def binm():
for item in x:
bc = sum(int(i) for i in item)
try:
jc[bc].add(b)
except KeyError:
jc[bc] = set([b])

Looks prettier, have less statements and is (or should be) faster. This
only works in Python 2.4, but you can adapt it to run on other versions,
too.

Cheers,
Felipe.

-- 
"Quem excele em empregar a força militar subjulga os exércitos dos
outros povos sem travar batalha, toma cidades fortificadas dos outros
povos sem as atacar e destrói os estados dos outros povos sem lutas
prolongadas. Deve lutar sob o Céu com o propósito primordial da
'preservação'. Desse modo suas armas não se embotarão, e os ganhos
poderão ser preservados. Essa é a estratégia para planejar ofensivas."

  -- Sun Tzu, em "A arte da guerra"

-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Which is faster? (if not b in m) or (if m.count(b) > 0)

2006-02-15 Thread bruno at modulix
Farel wrote:
> Which is Faster in Python and Why?

Why don't you try by yourself ?
hint :
from timeit import Timer
help(Timer)


-- 
bruno desthuilliers
python -c "print '@'.join(['.'.join([w[::-1] for w in p.split('.')]) for
p in '[EMAIL PROTECTED]'.split('@')])"
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Which is faster? (if not b in m) or (if m.count(b) > 0)

2006-02-15 Thread Steven D'Aprano
On Wed, 15 Feb 2006 08:44:10 +0100, Marc 'BlackJack' Rintsch wrote:

> In <[EMAIL PROTECTED]>, Farel wrote:
> 
>> Which is Faster in Python and Why?
> 
> ``if not b in m`` looks at each element of `m` until it finds `b` in it
> and stops then.  Assuming `b` is in `m`, otherwise all elements of `m` are
> "touched".
> 
> ``if m.count(b) > 0`` will always goes through all elements of `m` in the
> `count()` method.

But the first technique executes in (relatively slow) pure Python, while
the count method executes (relatively fast) C code. So even though count
may do more work, it may do it faster.

The only way to tell for sure is to actually time the code, not try to
guess.



-- 
Steven.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Which is faster? (if not b in m) or (if m.count(b) > 0)

2006-02-15 Thread Georg Brandl
Steven D'Aprano wrote:
> On Wed, 15 Feb 2006 08:44:10 +0100, Marc 'BlackJack' Rintsch wrote:
> 
>> In <[EMAIL PROTECTED]>, Farel wrote:
>> 
>>> Which is Faster in Python and Why?
>> 
>> ``if not b in m`` looks at each element of `m` until it finds `b` in it
>> and stops then.  Assuming `b` is in `m`, otherwise all elements of `m` are
>> "touched".
>> 
>> ``if m.count(b) > 0`` will always goes through all elements of `m` in the
>> `count()` method.
> 
> But the first technique executes in (relatively slow) pure Python, while
> the count method executes (relatively fast) C code. So even though count
> may do more work, it may do it faster.

Why does "not b in m" execute in pure Python? list.__contains__ is implemented
in C code as well as list.count.

Georg
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Which is faster? (if not b in m) or (if m.count(b) > 0)

2006-02-15 Thread Kent Johnson
Steven D'Aprano wrote:
> On Wed, 15 Feb 2006 08:44:10 +0100, Marc 'BlackJack' Rintsch wrote:

>>``if not b in m`` looks at each element of `m` until it finds `b` in it
>>and stops then.  Assuming `b` is in `m`, otherwise all elements of `m` are
>>"touched".
>>
>>``if m.count(b) > 0`` will always goes through all elements of `m` in the
>>`count()` method.
> 
> 
> But the first technique executes in (relatively slow) pure Python, while
> the count method executes (relatively fast) C code. So even though count
> may do more work, it may do it faster.

'a in b' actually takes fewer bytecodes because 'in' has it's own 
bytecode but b.count requires an attribute lookup:

  >>> import dis
  >>> def foo():
  ...   a in b
  ...
  >>> def bar():
  ...   b.count(a)
  ...
   >>> dis.dis(foo)
   2   0 LOAD_GLOBAL  0 (a)
   3 LOAD_GLOBAL  1 (b)
   6 COMPARE_OP   6 (in)
   9 POP_TOP
  10 LOAD_CONST   0 (None)
  13 RETURN_VALUE
  >>> dis.dis(bar)
   2   0 LOAD_GLOBAL  0 (b)
   3 LOAD_ATTR1 (count)
   6 LOAD_GLOBAL  2 (a)
   9 CALL_FUNCTION1
  12 POP_TOP
  13 LOAD_CONST   0 (None)
  16 RETURN_VALUE

Not much difference in speed when the item is not in the list, a slight 
edge to 'a in b' for a large list:

D:\>python -m timeit -s "lst = range(1000)" "1001 in lst"
1 loops, best of 3: 55.5 usec per loop

D:\>python -m timeit -s "lst = range(1000)" "lst.count(1001) > 1"
1 loops, best of 3: 55.5 usec per loop

D:\>python -m timeit -s "lst = range(100)" "lst.count(-1) > 1"
10 loops, best of 3: 62.2 msec per loop

D:\>python -m timeit -s "lst = range(100)" "(-1) in lst"
10 loops, best of 3: 60.8 msec per loop

Of course if the item appears in the list, 'a in b' has a huge advantage:

D:\>python -m timeit -s "lst = range(100)" "(1) in lst"
1000 loops, best of 3: 0.171 usec per loop

Kent
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Which is faster? (if not b in m) or (if m.count(b) > 0)

2006-02-15 Thread Steven D'Aprano
Georg Brandl wrote:

> Steven D'Aprano wrote:
> 
>>On Wed, 15 Feb 2006 08:44:10 +0100, Marc 'BlackJack' Rintsch wrote:
>>
>>
>>>In <[EMAIL PROTECTED]>, Farel wrote:
>>>
>>>
Which is Faster in Python and Why?
>>>
>>>``if not b in m`` looks at each element of `m` until it finds `b` in it
>>>and stops then.  Assuming `b` is in `m`, otherwise all elements of `m` are
>>>"touched".
>>>
>>>``if m.count(b) > 0`` will always goes through all elements of `m` in the
>>>`count()` method.
>>
>>But the first technique executes in (relatively slow) pure Python, while
>>the count method executes (relatively fast) C code. So even though count
>>may do more work, it may do it faster.
> 
> 
> Why does "not b in m" execute in pure Python? list.__contains__ is implemented
> in C code as well as list.count.

Thank you to Kent Johnson for actually *timing* the two 
pieces of code with various sets of data, cutting 
through all the wild guesses (including mine). Which 
was my point.

To answer Georg's question, perhaps I was wrong -- I 
was under the impression that "target in source" 
executes more or less like:

for obj in source:
 if target == obj: return True
return False

Perhaps this was only true in older versions of Python, 
or perhaps I was misinformed, or perhaps I imagined the 
whole thing.

The important lesson is that your intuitions about what 
will be fast are not necessarily correct, and all the 
theorizing in the world is no substitute for good 
testing. (On the other hand, lousy testing is 
practically worthless.)


-- 
Steven.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Which is faster? (if not b in m) or (if m.count(b) > 0)

2006-02-15 Thread Steve Holden
Steven D'Aprano wrote:
[...]
(On the other hand, lousy testing is practically worthless.)

+1 QOTW
-- 
Steve Holden   +44 150 684 7255  +1 800 494 3119
Holden Web LLC www.holdenweb.com
PyCon TX 2006  www.python.org/pycon/

-- 
http://mail.python.org/mailman/listinfo/python-list


RE: Which is faster? (if not b in m) or (if m.count(b) > 0)

2006-02-15 Thread Delaney, Timothy (Tim)
Two additional things beyond the important lesson to always time things:

1. Be careful with operator precedence. In this case, people got lucky:

not b in m

has the precedence:

not (b in m)

2. Python has a "not in" operator:

b not in m

which is the clearest and fastest way to test for non-membership in a
container.

Tim Delaney
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Which is faster? (if not b in m) or (if m.count(b) > 0)

2006-02-17 Thread Farel
Great observations... Changing for lists to tuples would make sorting
unnecessary...

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Which is faster? (if not b in m) or (if m.count(b) > 0)

2006-02-17 Thread Farel
Well, I could, but I was looking for answers from persons more
experienced in the ways of Python then myself

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Which is faster? (if not b in m) or (if m.count(b) > 0)

2006-02-17 Thread Farel
Tim, Are you saying that:
 not (b in m)
is faster than:
b not in m


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Which is faster? (if not b in m) or (if m.count(b) > 0)

2006-02-17 Thread Farel
Execellent, Kent. Thank you... and everyone too! I've learned much!

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Which is faster? (if not b in m) or (if m.count(b) > 0)

2006-02-17 Thread Peter Hansen
Farel wrote:
> Tim, Are you saying that:
>  not (b in m)
> is faster than:
> b not in m
> 

As Bruno tried to tell you, and I quote:

"""
Why don't you try by yourself ?
hint :
from timeit import Timer
help(Timer)
"""

Asking "which is faster" in comp.lang.python is the least efficient way 
of getting an answer, for everyone who reads the newsgroup, everyone who 
tries to help and, most importantly, for yourself.

-Peter

-- 
http://mail.python.org/mailman/listinfo/python-list


RE: Which is faster? (if not b in m) or (if m.count(b) > 0)

2006-02-19 Thread Delaney, Timothy (Tim)
Farel wrote:

> Tim, Are you saying that:
>  not (b in m)
> is faster than:
> b not in m

On the contrary. `not (b in m)` requires negating the result of `b in m`
(via an additional bytecode operation). `b not in m` doesn't. However,
the difference in performance is minimal, as testing using the timeit
module will show you.

The important difference is the improvement in clarity. There is no
possibility for misinterpretation as to the meaning of `b not in m`,
whereas with the original `not b in m` I had to actually go check the
precedence rules to be sure what would happen. Adding the parentheses
makes it clear, but doesn't read as well as using the `not in` operator.

As others have said, if you really care about the performance of
something, the timeit module is your friend. Discussions about *why* you
get certain performance results are then useful.

Tim Delaney
-- 
http://mail.python.org/mailman/listinfo/python-list


RE: Which is faster? (if not b in m) or (if m.count(b) > 0)

2006-02-19 Thread Jean-Paul Calderone
On Mon, 20 Feb 2006 10:08:48 +1100, "Delaney, Timothy \(Tim\)" <[EMAIL 
PROTECTED]> wrote:
>Farel wrote:
>
>> Tim, Are you saying that:
>>  not (b in m)
>> is faster than:
>> b not in m
>
>On the contrary. `not (b in m)` requires negating the result of `b in m`
>(via an additional bytecode operation). `b not in m` doesn't. However,
>the difference in performance is minimal, as testing using the timeit
>module will show you.

Not since Python 2.4:

>>> dis.dis(lambda: x not in y)
  1   0 LOAD_GLOBAL  0 (x)
  3 LOAD_GLOBAL  1 (y)
  6 COMPARE_OP   7 (not in)
  9 RETURN_VALUE
>>> dis.dis(lambda: not (x in y))
  1   0 LOAD_GLOBAL  0 (x)
  3 LOAD_GLOBAL  1 (y)
  6 COMPARE_OP   7 (not in)
  9 RETURN_VALUE
>>> 

>
>The important difference is the improvement in clarity. There is no
>possibility for misinterpretation as to the meaning of `b not in m`,
>whereas with the original `not b in m` I had to actually go check the
>precedence rules to be sure what would happen. Adding the parentheses
>makes it clear, but doesn't read as well as using the `not in` operator.
>
>As others have said, if you really care about the performance of
>something, the timeit module is your friend. Discussions about *why* you
>get certain performance results are then useful.

But this stuff is all still true :)

Jean-Paul
-- 
http://mail.python.org/mailman/listinfo/python-list


RE: Which is faster? (if not b in m) or (if m.count(b) > 0)

2006-02-19 Thread Delaney, Timothy (Tim)
Jean-Paul Calderone wrote:

> Not since Python 2.4:
> 
> >>> dis.dis(lambda: x not in y)
>   1   0 LOAD_GLOBAL  0 (x)
>   3 LOAD_GLOBAL  1 (y)
>   6 COMPARE_OP   7 (not in)
>   9 RETURN_VALUE
> >>> dis.dis(lambda: not (x in y))
>   1   0 LOAD_GLOBAL  0 (x)
>   3 LOAD_GLOBAL  1 (y)
>   6 COMPARE_OP   7 (not in)
>   9 RETURN_VALUE
> >>>

Damn - I missed that change. Peephole optimiser I guess.

Tim Delaney
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Which is faster? (if not b in m) or (if m.count(b) > 0)

2006-02-22 Thread Farel
Thank you, Peter!  Please understand, I was attempting to get more info
on the WHY x is faster than y...  from those with more experience.
Timer cannot give me that info.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Which is faster? (if not b in m) or (if m.count(b) > 0)

2006-02-23 Thread Peter Hansen
Farel wrote:
> Thank you, Peter!  Please understand, I was attempting to get more info
> on the WHY x is faster than y...  from those with more experience.
> Timer cannot give me that info.

Thanks for clarifying.  For future reference, it's traditional in 
English to use the word "why" in a question when you're asking WHY, 
rather than, say, phrasing the question like this:

Farel wrote:

 >> Tim, Are you saying that:
 >>  not (b in m)
 >> is faster than:
 >> b not in m
 >> 


-Peter

-- 
http://mail.python.org/mailman/listinfo/python-list