Cameron Simpson wrote:

> On 29May2017 01:17, Alan Gauld <alan.ga...@yahoo.co.uk> wrote:
>>On 29/05/17 00:12, Alex Kleider wrote:
>>> Would
>>>>>> if Month in {'January', '1'}:
>>>
>>> be even better? (regarding efficiency perhaps? Trivial point, I know,
>>> but just wondering.)
>>
>>If in doubt try it out and profile/time it.
>>
>>But I don't think it will make much difference since ultimately
>>it still has to test each value (although a hashing algorithm
>>may be involved that works on partial matches...) But if in
>>doubt...
>
> Hi Alex,
>
> As written it should be a bit slower: to construct a set each member gets
> tested for presence. The cost is in making the set, not in searching it.

No, CPython is a bit smarter than that:

>>> import dis, io
>>> dis.dis('if m in {"1", "January"}: pass')
  1           0 LOAD_NAME                0 (m)
              2 LOAD_CONST               3 (frozenset({'1', 'January'}))
              4 COMPARE_OP               6 (in)
              6 POP_JUMP_IF_FALSE        8
        >>    8 LOAD_CONST               2 (None)
             10 RETURN_VALUE

However, there seems to be a limit:

>>> def check(n):
...     f = io.StringIO()
...     dis.dis("if m in {%s}: pass" % ",".join(map(str, range(n))), file=f)
...     return f.getvalue().count("LOAD_CONST")
...
>>> check(123)
2
>>> check(124)
125

This is Python 3.6; for 3.5 and 3.4 I found a maximum length of 80 using the
same method. Further tests indicate that you cannot have two sets of length 80.

> _However_, supposing your program were doing this a lot. You might well
> have a global (or, better, long-lived shared object) containing a set that
> has already been constructed. Then:
>
>   if Month in the_set:
>
> is very fast; constant time.

Overall, using globals may still be a good idea. Even if there's often no
direct performance reward (looking up a global is even a tad slower) you gain
predictability.

> Whereas as you would expect, checking a list
> is linear with the size of the list.
>
> So, using a list:
>
>   seen = []
>   for item in some_item_generator():
>     if item in seen:
>       continue
>     seen.append(item)
>     # ... do stuff with item, which is new ...
>
> The cost of the "if" goes up linearly as you add more items.
>
> Using a set:
>
>   seen = set()  # note: {} would create an empty dict, which has no add()
>   for item in some_item_generator():
>     if item in seen:
>       continue
>     seen.add(item)
>     # ... do stuff with item, which is new ...
>
> The second version will be much more efficient as the "seen" set grows; the
> lookup time on the set is essentially O(1) (constant time).
>
> But for an ad hoc 2 element list as in your original example the
> difference will be pretty small; making the 2 element set _should_ be
> slightly more expensive, and isn't the common idiom (==> less readable).
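
Following Alan's "try it out and time it" advice, here is a minimal sketch
for measuring the two loops above. The function names dedupe_with_list and
dedupe_with_set and the test data are made up for illustration; they are not
from the original posts, and the absolute numbers will vary with your machine
and Python version.

```python
import timeit


def dedupe_with_list(items):
    """Keep first occurrences, tracking seen items in a list (O(n) lookup)."""
    seen = []
    result = []
    for item in items:
        if item in seen:
            continue
        seen.append(item)
        result.append(item)
    return result


def dedupe_with_set(items):
    """Keep first occurrences, tracking seen items in a set (O(1) lookup)."""
    seen = set()
    result = []
    for item in items:
        if item in seen:
            continue
        seen.add(item)
        result.append(item)
    return result


# Hypothetical test data: 2000 distinct values, each appearing twice.
data = list(range(2000)) * 2

list_time = timeit.timeit(lambda: dedupe_with_list(data), number=3)
set_time = timeit.timeit(lambda: dedupe_with_set(data), number=3)

print("list-based: %.4f s" % list_time)
print("set-based:  %.4f s" % set_time)
```

With only two or three candidates, as in the original "if Month in ..."
question, the gap should be far smaller, so it is worth timing that case
separately before drawing conclusions.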
> Personally I use:
>
>   if value in ('a', 'b', 'c'):
>
> BTW, in Python we tend to use names like "Fred" for classes (or
> factories), and "fred" for regular variables. And "FRED" for things that
> would be constants in other languages. Eg:
>
>   MAX_THINGS = 16
>
>   class Foo:
>     ....
>
>   def FooBah(x):
>     return Foo(x, style="bah")
>
>   for fred in ....:
>
> Cheers,
> Cameron Simpson <c...@zip.com.au>
> _______________________________________________
> Tutor maillist  -  Tutor@python.org
> To unsubscribe or change subscription options:
> https://mail.python.org/mailman/listinfo/tutor