Re: preferring [] or () in list of error codes?

2009-06-19 Thread Ben Finney
Albert van der Horst  writes:

> But I greatly prefer a set
> 
> "
> for i in {point1,point2,point3}:
>     statements
> "

Agreed, for the reasons you cite. I think this idiom can be expected to
become more common and hopefully displace using a tuple literal or list
literal, as the set literal syntax becomes more reliably available on
arbitrary installed Python versions.

> [Yes I know { } doesn't denote a set. I tried it. I don't know how to
> denote a set ... ]

Try it in Python 3 and be prepared to be pleased
<http://docs.python.org/3.0/whatsnew/3.0.html#new-syntax>.
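
A minimal sketch of the set-literal syntax, assuming Python 3.0 or later. The names ERROR_CODES and is_known_error are made up for illustration; the codes are the ones from the original question.

```python
# Set literal: braces around comma-separated items.
ERROR_CODES = {25401, 25402, 25408}


def is_known_error(code):
    """Return True if code is one of the listed error codes (hypothetical helper)."""
    return code in ERROR_CODES


# Tuples are hashable, so points written as tuples can be set members.
points = {(0, 1, 0), (1, 0, 0), (0, 0, 1)}
for point in points:
    # Iteration order over a set is not guaranteed.
    print(point)

# Caveat: empty braces still build a dict, so an empty set needs set().
empty_dict = {}
empty_set = set()
```

Note that iterating over a set only drops the ordering guarantee; it does not by itself make the loop body run concurrently.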

-- 
 \   “Too many Indians spoil the golden egg.” —Sir Joh |
  `\   Bjelke-Petersen |
_o__)  |
Ben Finney
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: preferring [] or () in list of error codes?

2009-06-19 Thread Albert van der Horst
In article ,
>
>
>But practicality beats purity -- there are many scenarios where we make
>compromises in our meaning in order to get correct, efficient code. E.g.
>we use floats, despite them being a poor substitute for the abstract Real
>numbers we mean.
>
>In addition, using a tuple or a list in this context:
>
>if e.message.code in (25401,25402,25408):
>
>is so idiomatic, that using a set in its place would be distracting.
>Rather than efficiently communicating the programmer's intention, it
>would raise in my mind the question "that's strange, why are they using a
>set there instead of a tuple?".

As a newbie I'm really expecting a set here. The only reason my mind
goes in the direction of 3 items is that it makes no sense in
combination with ``in''.  That makes this idiom one that should be killed.

"
point1 = (0,1,0)
point2 = (1,0,0)
point3 = (0,0,1)
for i in (point1,point2, point3):

"  ???

I don't think so.
At least I would do

"
for i in [point1,point2,point3]:
    statements
"

But I greatly prefer a set

"
for i in {point1,point2,point3}:
    statements
"
Because a set is unordered, this would convey to
the compiler that it may evaluate the three statements concurrently.
For a list I expect the guarantee that the statements are
evaluated in order.
For a tuple I don't know what to expect. That alone is sufficient
reason not to use it here.

[Yes I know { } doesn't denote a set. I tried it. I don't
know how to denote a set ... ]

>--
>Steven

Groetjes Albert.

-- 
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
alb...@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst



Re: preferring [] or () in list of error codes?

2009-06-13 Thread Ben Finney
Mel  writes:

> The immutability makes it easier to talk about the semantic meanings.
> After you do
>
>     event_timestamp = (2009, 06, 04, 05, 02, 03)
>
> there's nothing that can happen to the tuple to invalidate
>
>     (year, month, day, hour, minute, second) = event_timestamp
>
> even though, as you say, there's nothing in the tuple to inform anybody
> about the year, month, day, ... interpretation.

Also note that the stdlib ‘collections.namedtuple’ implementation
<http://docs.python.org/library/collections.html#collections.namedtuple>
essentially acknowledges this: the names are assigned in advance to
index positions, tying a specific semantic meaning to each position.
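
A small sketch of that idea, assuming a hypothetical Timestamp type (the name is not from the thread). The zero-padded literals quoted above, such as 06, are Python 2 octal literals and are rejected by Python 3, so plain integers are used here.

```python
from collections import namedtuple

# Field names are assigned to index positions once, in advance.
Timestamp = namedtuple("Timestamp", "year month day hour minute second")

event_timestamp = Timestamp(2009, 6, 4, 5, 2, 3)

# Positional unpacking still works, exactly as with a plain tuple.
year, month, day, hour, minute, second = event_timestamp

# The same item is reachable by name or by index.
print(event_timestamp.minute, event_timestamp[4])

# It is still a tuple, so it compares equal to a plain tuple of the same items.
print(event_timestamp == (2009, 6, 4, 5, 2, 3))
```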

-- 
 \   “Prediction is very difficult, especially of the future.” |
  `\   —Niels Bohr |
_o__)  |
Ben Finney


Re: preferring [] or () in list of error codes?

2009-06-13 Thread Mel
Gunter Henriksen wrote:
[ ... ]
> I guess to me, fundamentally, the interpretation of
> tuple as a sequence whose elements have semantic meaning
> implicitly defined by position is a relatively abstract
> interpretation whose value is dubious relative to the
> value of immutability, since it seems like a shortcut
> which sacrifices explicitness for the sake of brevity.

The immutability makes it easier to talk about the semantic meanings.  After
you do

    event_timestamp = (2009, 06, 04, 05, 02, 03)

there's nothing that can happen to the tuple to invalidate

    (year, month, day, hour, minute, second) = event_timestamp

even though, as you say, there's nothing in the tuple to inform anybody
about the year, month, day, ... interpretation.

And of course there's nothing in a C struct object that isn't in the 
equivalent Python tuple.  The difference is that the C compiler has arranged 
all the outside code that uses the struct object to use it in the correct 
way.  The only object I've found in Python that truly replaces a struct 
object in C is a dict with string keys -- or an object that uses such a dict 
as its __dict__.
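
A rough sketch of that last point, assuming a made-up Event class: an ordinary instance keeps its fields in a dict with string keys, reachable through __dict__ or vars().

```python
class Event:
    """Hypothetical struct-like record; class and field names are illustrative."""

    def __init__(self, year, month, day):
        self.year = year
        self.month = month
        self.day = day


event = Event(2009, 6, 4)

# vars() returns the instance's __dict__: a dict with string keys.
fields = vars(event)
print(fields)

# Unlike a tuple, the record is mutable, and each field is addressed by name.
event.day = 5
print(fields["day"])
```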


Mel.



Re: preferring [] or () in list of error codes?

2009-06-11 Thread Gunter Henriksen
> > >event_timestamp = (2009, 06, 04, 05, 02, 03)
> > >(year, month, day, hour, minute, second) = event_timestamp
> >
> > [...]
>
> The point of each position having a different semantic meaning is that
> tuple unpacking works as above. You need to know the meaning of each
> position in order to unpack it to separate names, as above.
>
> So two tuples that differ only in the sequence of their items are
> different in meaning. This is unlike a list, where the sequence of items
> does *not* affect the semantic meaning of each item.

I do not feel the above is significantly different enough from

event_timestamp = [2009, 06, 04, 05, 02, 03]
(year, month, day, hour, minute, second) = event_timestamp

event_timestamp = (2009, 06, 04, 05, 02, 03)
(year, month, day, hour, minute, second) = event_timestamp

event_timestamp = [2009, 06, 04, 05, 02, 03]
[year, month, day, hour, minute, second] = event_timestamp

to suggest tuples are really adding significant value
in this case, especially when I can do something like

event_timestamp = (2009, 06, 04, 05, 02, 03)
(year, month, day, hour, second, minute) = event_timestamp

and not have any indication I have done the wrong thing.
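
The comparison above can be sketched as follows, assuming Python 3 (so the zero-padded literals are written as plain integers). The variable names as_tuple and as_list are illustrative.

```python
as_tuple = (2009, 6, 4, 5, 2, 3)
as_list = [2009, 6, 4, 5, 2, 3]

# Unpacking treats a list and a tuple the same way.
year, month, day, hour, minute, second = as_tuple
from_tuple = (year, month, day, hour, minute, second)

year, month, day, hour, minute, second = as_list
from_list = (year, month, day, hour, minute, second)

print(from_tuple == from_list)

# Swapping two target names is accepted silently: nothing in the tuple
# records which position is the minute and which is the second.
year, month, day, hour, second, minute = as_tuple
print(minute, second)
```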

I guess to me, fundamentally, the interpretation of
tuple as a sequence whose elements have semantic meaning
implicitly defined by position is a relatively abstract
interpretation whose value is dubious relative to the
value of immutability, since it seems like a shortcut
which sacrifices explicitness for the sake of brevity.

I would feel differently if it seemed unusual to find good
Python code which iterates through the elements of a
tuple as a variable length homogenous ordered collection.
But then I would be wishing for immutable lists...


Re: preferring [] or () in list of error codes?

2009-06-11 Thread Ben Finney
Gunter Henriksen  writes:

> > Try, then, this tuple:
> >
> >event_timestamp = (2009, 06, 04, 05, 02, 03)
> >(year, month, day, hour, minute, second) = event_timestamp
> 
> I totally agree about anything to do with immutability, I think the
> relative ordering of the elements in this example may be orthogonal to
> the concept of a tuple as an object whose elements have a semantic
> meaning implicitly defined by location in the sequence... in other
> words knowing that element i+1 is in some sense ordinally smaller than
> element i does not give me much information about what element i+1
> actually is.

The point of each position having a different semantic meaning is that
tuple unpacking works as above. You need to know the meaning of each
position in order to unpack it to separate names, as above.

So two tuples that differ only in the sequence of their items are
different in meaning. This is unlike a list, where the sequence of items
does *not* affect the semantic meaning of each item.

Note that I'm well aware that the language doesn't impose this as a hard
restriction; but that says more about Python's “consenting adults”
philosophy than anything else.

-- 
 \   “I went to a general store. They wouldn't let me buy anything |
  `\ specifically.” —Steven Wright |
_o__)  |
Ben Finney


Re: preferring [] or () in list of error codes?

2009-06-11 Thread Gunter Henriksen
> Try, then, this tuple:
>
>event_timestamp = (2009, 06, 04, 05, 02, 03)
>(year, month, day, hour, minute, second) = event_timestamp
>
> A list would be wrong for this value, because each position in the
> sequence has a specific meaning beyond its mere sequential position. Yet
> it also matters to the reader that these items are in a specific
> sequence, since that's a fairly standard ordering for those items.
>
> In this case, a tuple is superior to a list because it correctly conveys
> the semantic meaning of the overall value: the items must retain their
> sequential order to have the intended meaning, and to alter any one of
> them is conceptually to create a new timestamp value.

I totally agree about anything to do with immutability,
I think the relative ordering of the elements in this
example may be orthogonal to the concept of a tuple
as an object whose elements have a semantic meaning
implicitly defined by location in the sequence... in
other words knowing that element i+1 is in some sense
ordinally smaller than element i does not give me much
information about what element i+1 actually is.

To me a timestamp could be (date, time), or (days,
seconds, microseconds) (as in datetime.timedelta()), so
it is not clear to me that using a tuple as something
where the semantic meaning of the element at position i
should be readily apparent would be the best approach
for timestamps, or enough to distinguish list and tuple
(in other words I am not suggesting a dict or class).

In the case of something like (x, y) or (real, imag),
or (longitude, latitude), or any case where there is
common agreement and understanding, such that using
names is arguably superfluous... I think in those
cases the concept makes sense of a tuple as a sequence
of attributes whose elements have a semantic meaning
implicitly defined by position in the sequence.  My
feeling is the number of cases where tuples are better
than lists for that is small relative to the number of
cases where tuple adds value as an immutable list.

I do not mean to be suggesting that a tuple should only
ever be used or thought of as a "frozenlist" though.


Re: preferring [] or () in list of error codes?

2009-06-11 Thread Ben Finney
Gunter Henriksen  writes:

> I think I would have difficulty holding a position that this should
> not be a class (or equivalent via namedtuple()) or a dict. It seems to
> me like a case could be made that there are far more situations where
> it makes sense to use tuples as immutable sequences than as objects
> whose attributes are named implicitly by an index. This dodge_city
> definitely does not seem to me like a good candidate for a plain
> tuple.

It's a fair cop. (I only meant that for this example a tuple was
superior to a list, but you're right that a dict would be better than
either.)

Try, then, this tuple:

event_timestamp = (2009, 06, 04, 05, 02, 03)
(year, month, day, hour, minute, second) = event_timestamp

A list would be wrong for this value, because each position in the
sequence has a specific meaning beyond its mere sequential position. Yet
it also matters to the reader that these items are in a specific
sequence, since that's a fairly standard ordering for those items.

In this case, a tuple is superior to a list because it correctly conveys
the semantic meaning of the overall value: the items must retain their
sequential order to have the intended meaning, and to alter any one of
them is conceptually to create a new timestamp value.

-- 
 \ “[The RIAA] have the patience to keep stomping. They're playing |
  `\ whack-a-mole with an infinite supply of tokens.” —kennon, |
_o__) http://kuro5hin.org/ |
Ben Finney


Re: preferring [] or () in list of error codes?

2009-06-11 Thread Gunter Henriksen
> [In this tuple]
>dodge_city = (1781, 1870, 1823)
>(population, feet_above_sea_level, establishment_year) = dodge_city
> each index in the sequence implies something very
> different about each value. The semantic meaning
> of each index is *more* than just the position in
> the sequence; it matters *for interpreting that
> component*, and that component would not mean the
> same thing in a different index position. A tuple
> is the right choice, for that reason.

I think I would have difficulty holding a position
that this should not be a class (or equivalent via
namedtuple()) or a dict.  It seems to me like a case
could be made that there are far more situations where
it makes sense to use tuples as immutable sequences than
as objects whose attributes are named implicitly by an
index.  This dodge_city definitely does not seem to me
like a good candidate for a plain tuple.


Re: preferring [] or () in list of error codes?

2009-06-11 Thread Gabriel Genellina

On Tue, 09 Jun 2009 05:02:33 -0300, Steven D'Aprano wrote:

> [...] As tuples are defined in Python, they quack like immutable lists,
> they walk like immutable lists, and they swim like immutable lists. Why
> shouldn't we treat them as immutable lists?
>
> Phillip Eby states that "Lists are intended to be homogeneous sequences,
> while tuples are heterogeneous data structures." (Notice the subtle shift
> there: lists are "intended", while tuples "are". But in fact, there's
> nothing to stop you from putting homogeneous data into a tuple, so Eby is
> wrong to say that tuples *are* heterogeneous.)
>
> Perhaps Eby intends lists to be homogeneous, perhaps Guido does too, but
> this is Python, where we vigorously defend the right to shoot ourselves
> in the foot. We strongly discourage class creators from trying to enforce
> their intentions by using private attributes, and even when we allow such
> a thing, the nature of Python is that nothing is truly private. Why
> should homogeneity and heterogeneity of lists and tuples be sacrosanct?
> Nothing stops me from putting heterogeneous data into a list, or
> homogeneous data into a tuple, and there don't appear to be any
> ill-effects from doing so. Why lose sleep over the alleged lack of
> purity?


Yes - but in the past the distinction was very much stronger. I think that
tuples didn't have *any* method until Python 2.0 -- so, even if someone
could consider a tuple a "read-only list", the illusion disappeared as
soon as she tried to write anything more complex than a[i]. Maybe tuples
could quack like immutable lists, but they could not swim nor walk...

With time, tuples gained more and more methods and are now very similar to
lists - they even have an index() method (undocumented but obvious) which
is absurd in the original context. Think of tuples as used in relational
databases: there is no way in SQL to express the condition "search for
this along all values in this tuple", because it usually doesn't make any
sense at all (and probably, if it does make sense in a certain case, it's
because the database is badly designed.)

But *now*, you can express that operation in Python. So I'd say that
*now*, the distinction between a "homogeneous container" and a
"heterogeneous data structure" has largely vanished, and it's hard to
convince people that tuples aren't just immutable lists. That is, *I*
would have used a list in this case:

for delay in (0.01, 0.1, 0.5, 1, 2, 5, 10, 30, 60):
   do_something(delay)

but I cannot find a *concrete* reason to support the assertion "list is
better".

So, for practical purposes, tuples act now as if they were immutable lists
-- one should be aware of the different memory allocation strategies, but
I see no other relevant differences.
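
A short sketch of the "immutable list" behaviour described above, assuming a Python version where tuples have index() and count() (2.6 or later, as far as I know). The delays tuple is the one from the loop above.

```python
delays = (0.01, 0.1, 0.5, 1, 2, 5, 10, 30, 60)

# Sequence-style searching works on a tuple just as it does on a list.
position = delays.index(0.5)
occurrences = delays.count(60)
print(position, occurrences)

# Item assignment is what a tuple refuses.
try:
    delays[0] = 0.02
except TypeError as exc:
    print("tuples are immutable:", exc)

# Mutating methods such as append() are simply absent.
print(hasattr(delays, "append"))
```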

--
Gabriel Genellina



Re: preferring [] or () in list of error codes?

2009-06-09 Thread Steven D'Aprano
On Tue, 09 Jun 2009 04:57:48 -0700, samwyse wrote:

> Time to test things!   I'm going to compare three things using Python
> 3.0:
>   X={...}\nS=lambda x: x in X
>   S=lambda x: x in {...}
>   S=lambda x: x in (...)
> where the ... is replaced by lists of integers of various lengths.
> Here's the test bed:

[snip]

Hmmm... I think your test-bed is unnecessarily complicated, making it 
difficult to see what is going on. Here's my version, with lists included 
for completeness. Each test prints the best of five trials of one million 
repetitions of ten successful searches, then does the same thing again 
for unsuccessful searches.


from timeit import Timer

def test(size):
    global s, l, t, targets
    print("Testing search with size %d" % size)
    rng = range(size)
    s, l, t = set(rng), list(rng), tuple(rng)
    # Calculate a (more or less) evenly distributed set of ten
    # targets to search for, including both end points.
    targets = [i*size//9 for i in range(9)] + [size-1]
    assert len(targets) == 10
    setup = "from __main__ import targets, %s"
    body = "for i in targets: i in %s"
    # Run a series of successful searches.
    for name in "s l t".split():
        obj = globals()[name]
        secs = min(Timer(body % name, setup % name).repeat(repeat=5))
        print("Successful search in %s: %f s" % (type(obj), secs))
    # Also run unsuccessful tests.
    targets = [size+x for x in targets]
    for name in "s l t".split():
        obj = globals()[name]
        secs = min(Timer(body % name, setup % name).repeat(repeat=5))
        print("Unsuccessful search in %s: %f s" % (type(obj), secs))

Results are:

>>> test(1)
Testing search with size 1
Successful search in <class 'set'>: 1.949509 s
Successful search in <class 'list'>: 1.838387 s
Successful search in <class 'tuple'>: 1.876309 s
Unsuccessful search in <class 'set'>: 1.998207 s
Unsuccessful search in <class 'list'>: 2.148660 s
Unsuccessful search in <class 'tuple'>: 2.137041 s
>>>
>>>
>>> test(10)
Testing search with size 10
Successful search in <class 'set'>: 1.943664 s
Successful search in <class 'list'>: 3.659786 s
Successful search in <class 'tuple'>: 3.569164 s
Unsuccessful search in <class 'set'>: 1.935553 s
Unsuccessful search in <class 'list'>: 5.833665 s
Unsuccessful search in <class 'tuple'>: 5.573177 s
>>>
>>>
>>> test(100)
Testing search with size 100
Successful search in <class 'set'>: 1.907839 s
Successful search in <class 'list'>: 21.704032 s
Successful search in <class 'tuple'>: 21.391875 s
Unsuccessful search in <class 'set'>: 1.916241 s
Unsuccessful search in <class 'list'>: 41.178029 s
Unsuccessful search in <class 'tuple'>: 41.856226 s
>>>
>>>
>>> test(1000)
Testing search with size 1000
Successful search in <class 'set'>: 2.256150 s
Successful search in <class 'list'>: 189.991579 s
Successful search in <class 'tuple'>: 187.349630 s
Unsuccessful search in <class 'set'>: 1.869202 s
Unsuccessful search in <class 'list'>: 398.451284 s
Unsuccessful search in <class 'tuple'>: 388.544178 s




As expected, lists and tuples are equally fast (or slow if you
prefer). Successful searches are about twice as fast as unsuccessful
ones, and performance suffers as the size of the list/tuple increases.
However, sets are nearly as fast no matter the size of the set, or
whether the search is successful or unsuccessful.



> You will note that testing against a list constant is just as fast as
> testing against a set.  This was surprising for me; apparently the
> __contains__ operator turns a tuple into a set.

I doubt that very much.


-- 
Steven


Re: preferring [] or () in list of error codes?

2009-06-09 Thread Terry Reedy

m...@pixar.com wrote:
> John Machin wrote:
>> T=lambda x:x in(25401,25402,25408);import dis;dis.dis(L);dis.dis(T)
>
> I've learned a lot from this thread, but this is the
> niftiest bit I've picked up... thanks!

If you are doing a lot of dissing, starting with

    from dis import dis

saves subsequent typing.

tjr




Re: preferring [] or () in list of error codes?

2009-06-09 Thread Terry Reedy

Steven D'Aprano wrote:

>> James Tauber explains this at
>> <http://jtauber.com/blog/2006/04/15/python_tuples_are_not_just_constant_lists/>.
>
> He doesn't really explain anything though, he merely states it as
> revealed wisdom. The closest he comes to an explanation is to declare
> that in tuples "the index in a tuple has an implied semantic. The point
> of a tuple is that the i-th slot means something specific. In other
> words, it's a index-based (rather than name based) datastructure." But he
> gives no reason for why we should accept that as true for tuples but not
> lists.
>
> It may be that that's precisely the motivation Guido had when he
> introduced tuples into Python, but why should we not overload tuples with
> more meanings than Guido (hypothetically) imagined? In other words, why
> *shouldn't* we treat tuples as immutable lists, if that helps us solve a
> problem effectively?


I believe that we should overload tuples with *less* specific meaning
than originally.  In 3.0, tuples have *all* the general sequence
operations and methods, including .index() and .count().  This was not
true in 2.5 (don't know about 2.6), which is why tuples are not yet
documented as having those two methods (reported in
http://bugs.python.org/issue4966).  Operationally, they are now general
immutable sequences.  Period.

Terry Jan Reedy



Re: preferring [] or () in list of error codes?

2009-06-09 Thread mh
John Machin  wrote:
> T=lambda x:x in(25401,25402,25408);import dis;dis.dis(L);dis.dis(T)

I've learned a lot from this thread, but this is the
niftiest bit I've picked up... thanks!

-- 
Mark Harrison
Pixar Animation Studios


Re: preferring [] or () in list of error codes?

2009-06-09 Thread Carl Banks
On Jun 9, 8:20 am, samwyse  wrote:
> On Jun 9, 12:30 am, Emile van Sebille  wrote:
>
> > On 6/8/2009 8:43 PM Ben Finney said...
> > > The fact that literal set syntax is a relative newcomer is the primary
> > > reason for that, I'd wager.
>
> > Well, no.  It really is more, "that's odd... why use set?"
>
> Until I ran some timing tests this morning, I'd have said that sets
> could determine membership faster than a list, but that's apparently
> not true,

See my reply to that post.  I believe your tests were flawed.

> assuming that the list has less than 8K members.  Above 16K
> members, sets are much faster than lists.  I'm not sure where the
> break is, or even why there's a break.

The break comes from the compiler, not the objects themselves.


Carl Banks


Re: preferring [] or () in list of error codes?

2009-06-09 Thread Carl Banks
On Jun 9, 4:57 am, samwyse  wrote:
> On Jun 8, 8:57 pm, samwyse  wrote:
>
> > I conclude that using constructors is generally a bad idea, since the
> > compiler doesn't know if you're calling the builtin or something with
> > an overloaded name.  I presume that the compiler will eventually
> > optimize the second example to match the last, but both of them use
> > the BUILD_SET opcode.  I expect that this can be expensive for long
> > lists, so I don't think that it's a good idea to use set constants
> > inside loops.  Instead it should be assigned to a global or class
> > variable.
>
> Time to test things!   I'm going to compare three things using Python
> 3.0:
>   X={...}\nS=lambda x: x in X
>   S=lambda x: x in {...}
>   S=lambda x: x in (...)
> where the ... is replaced by lists of integers of various lengths.
> Here's the test bed:
>
> from random import seed, sample
> from timeit import Timer
> maxint = 2**31-1
> values = list(map(lambda n: 2**n-1, range(1,16)))
> def append_numbers(k, setup):
>     seed(1968740928)
>     for i in sample(range(maxint), k):
>         setup.append(str(i))
>         setup.append(',')
> print('==', 'separate set constant')
> for n in values[::2]:
>     print('===', n, 'values')
>     setup = ['X={']
>     append_numbers(n, setup)
>     setup.append('}\nS=lambda x: x in X')
>     t = Timer('S(88632719)', ''.join(setup))
>     print(t.repeat())
> print('==', 'in-line set constant')
> for n in values[:4]:
>     print('===', n, 'values')
>     setup = ['S=lambda x: x in {']
>     append_numbers(n, setup)
>     setup.append('}')
>     t = Timer('S(88632719)', ''.join(setup))
>     print(t.repeat())
> print('==', 'in-line list constant')
> for n in values:
>     print('===', n, 'values')
>     setup = ['S=lambda x: x in (']
>     append_numbers(n, setup)
>     setup.append(')')
>     t = Timer('S(88632719)', ''.join(setup))
>     print(t.repeat())

It looks like you are evaluating the list/set/tuple every pass, and
then, for lists and tuples, always indexing the first item.


> And here are the results.  There's something interesting at the very
> end.
[snip results showing virtually identical performance for list, set,
and tuple]
> You will note that testing against a list constant is just as fast as
> testing against a set.  This was surprising for me; apparently the
> __contains__ operator turns a tuple into a set.

Given the way you wrote the test, this is hardly surprising.

I would expect "item in list" to have comparable execution time to
"item in set" if item is always the first element in list.

Furthermore, the Python compiler appears to be optimizing this
specific case to always use a precompiled set.  Well, almost
always.


> You will also note
> that performance falls off drastically for the last set of values.
> I'm not sure what happens there; I guess I'll file a bug report.

Please don't; it's not a bug.  The slowdown is because at sizes above
a certain threshold the Python compiler doesn't try to precompile in-
line lists, sets, and tuples.  The last case was above that limit.
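
One way to look at what the compiler precompiled is to inspect the code object's constants and disassembly. This is only a sketch, assuming CPython; which constants appear, and any size threshold, are implementation details that vary between versions. The lambda names are made up for illustration.

```python
import dis

in_tuple = lambda x: x in (25401, 25402, 25408)
in_list = lambda x: x in [25401, 25402, 25408]
in_set = lambda x: x in {25401, 25402, 25408}

for name, fn in [("tuple", in_tuple), ("list", in_list), ("set", in_set)]:
    # co_consts lists the constants stored with the compiled code.
    # A whole container showing up here means it was built at compile
    # time rather than on every call.
    print(name, fn.__code__.co_consts)
    # The disassembly shows whether a BUILD_* opcode runs on each call.
    dis.dis(fn)
```

Whatever the compiler does with the literal, the result of the membership test is the same; only the cost per call changes.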


Carl Banks


Re: preferring [] or () in list of error codes?

2009-06-09 Thread samwyse
On Jun 9, 12:30 am, Emile van Sebille  wrote:
> On 6/8/2009 8:43 PM Ben Finney said...

> > The fact that literal set syntax is a relative newcomer is the primary
> > reason for that, I'd wager.
>
> Well, no.  It really is more, "that's odd... why use set?"

Until I ran some timing tests this morning, I'd have said that sets
could determine membership faster than a list, but that's apparently
not true, assuming that the list has less than 8K members.  Above 16K
members, sets are much faster than lists.  I'm not sure where the
break is, or even why there's a break.


Re: preferring [] or () in list of error codes?

2009-06-09 Thread Ben Finney
Steven D'Aprano  writes:

> On Tue, 09 Jun 2009 09:43:45 +1000, Ben Finney wrote:
> 
> > Use a list when the semantic meaning of an item doesn't depend on
> > all the other items: it's “only” a collection of values.
> > 
> > Your list of message codes is a good example: if a value appears at
> > index 3, that doesn't make it mean something different from the same
> > value appearing at index 2.
> 
> That advice would seem to imply that lists shouldn't be ordered.

No such implication. Order is important in a list, it just doesn't
change the semantic meaning of the value.

> If a list of values has an order, it implies that "first place" (index
> 0) is different from "second place", by virtue of the positions they
> appear in the list. The lists:
> 
> presidential_candidates_sorted_by_votes = ['Obama', 'McCain']
> presidential_candidates_sorted_by_votes = ['McCain', 'Obama']
> 
> have very different meanings.

But the semantic meaning of each value is unchanged: each is still a
presidential candidate's surname. The additional semantic meaning of
putting it in a list is no more than the position in the sequence. A
list is the right choice, for that reason.


Whereas, for example, in this tuple:

dodge_city = (1781, 1870, 1823)
(population, feet_above_sea_level, establishment_year) = dodge_city

each index in the sequence implies something very different about each
value. The semantic meaning of each index is *more* than just the
position in the sequence; it matters *for interpreting that component*,
and that component would not mean the same thing in a different index
position. A tuple is the right choice, for that reason.

-- 
 \   “Are you pondering what I'm pondering?” “Umm, I think so, Don |
  `\  Cerebro, but, umm, why would Sophia Loren do a musical?” |
_o__)   —_Pinky and The Brain_ |
Ben Finney


Re: preferring [] or () in list of error codes?

2009-06-09 Thread samwyse
On Jun 8, 8:57 pm, samwyse  wrote:
> I conclude that using constructors is generally a bad idea, since the
> compiler doesn't know if you're calling the builtin or something with
> an overloaded name.  I presume that the compiler will eventually
> optimize the second example to match the last, but both of them use
> the BUILD_SET opcode.  I expect that this can be expensive for long
> lists, so I don't think that it's a good idea to use set constants
> inside loops.  Instead it should be assigned to a global or class
> variable.

Time to test things!   I'm going to compare three things using Python
3.0:
  X={...}\nS=lambda x: x in X
  S=lambda x: x in {...}
  S=lambda x: x in (...)
where the ... is replaced by lists of integers of various lengths.
Here's the test bed:

from random import seed, sample
from timeit import Timer
maxint = 2**31-1
values = list(map(lambda n: 2**n-1, range(1,16)))
def append_numbers(k, setup):
    seed(1968740928)
    for i in sample(range(maxint), k):
        setup.append(str(i))
        setup.append(',')
print('==', 'separate set constant')
for n in values[::2]:
    print('===', n, 'values')
    setup = ['X={']
    append_numbers(n, setup)
    setup.append('}\nS=lambda x: x in X')
    t = Timer('S(88632719)', ''.join(setup))
    print(t.repeat())
print('==', 'in-line set constant')
for n in values[:4]:
    print('===', n, 'values')
    setup = ['S=lambda x: x in {']
    append_numbers(n, setup)
    setup.append('}')
    t = Timer('S(88632719)', ''.join(setup))
    print(t.repeat())
print('==', 'in-line list constant')
for n in values:
    print('===', n, 'values')
    setup = ['S=lambda x: x in (']
    append_numbers(n, setup)
    setup.append(')')
    t = Timer('S(88632719)', ''.join(setup))
    print(t.repeat())

And here are the results.  There's something interesting at the very
end.

== separate set constant
=== 1 values
[0.26937306277753176, 0.26113626173158877, 0.2692190487889]
=== 7 values
[0.26583266867716426, 0.27223543774418268, 0.27681646689732919]
=== 31 values
[0.25089725090758752, 0.25562690230182894, 0.25844625504079444]
=== 127 values
[0.32404313956103392, 0.33048948958596691, 0.34487930728626104]
=== 511 values
[0.27574566041214732, 0.26991838348169983, 0.28309016928129083]
=== 2047 values
[0.27826162263639631, 0.27337357122204065, 0.26888752620793976]
=== 8191 values
[0.27479134917985437, 0.27955955295994261, 0.27740676538498654]
=== 32767 values
[0.26189725230441319, 0.25949247739587022, 0.2537356004743625]
== in-line set constant
=== 1 values
[0.43579086168772818, 0.4231755711968983, 0.42178740594125852]
=== 3 values
[0.54712875519095228, 0.55325048295244272, 0.54346991028189251]
=== 7 values
[1.1897654590178366, 1.1763383335032813, 1.2009900699669931]
=== 15 values
[1.7661906750718313, 1.7585005915556291, 1.7405896559478933]
== in-line list constant
=== 1 values
[0.23651385860493335, 0.24746972031361381, 0.23778469051234197]
=== 3 values
[0.23710750947396875, 0.23205630883254713, 0.23345592805789295]
=== 7 values
[0.24607764394636789, 0.23551903943099006, 0.24241377046524093]
=== 15 values
[0.2279376289444599, 0.22491908887861456, 0.24076747184349045]
=== 31 values
[0.22860084172708994, 0.233022074034551, 0.23138639128715965]
=== 63 values
[0.23671639831319169, 0.23404259479906031, 0.22269394573891077]
=== 127 values
[0.22754176857673158, 0.22818151468971593, 0.22711154629987718]
=== 255 values
[0.23503126794047802, 0.24493699618247788, 0.26690207833677349]
=== 511 values
[0.24518255811842238, 0.23878118587697728, 0.22844830837438934]
=== 1023 values
[0.23285585179122137, 0.24067220833932623, 0.23807439213642922]
=== 2047 values
[0.24206484343680756, 0.24352201187581102, 0.24366253252857462]
=== 4095 values
[0.24624526301527183, 0.23692145230748807, 0.23829956041899081]
=== 8191 values
[0.22246514570986164, 0.22435309515595137, 0.011456761]
=== 16383 values
[194.29462683106374, 193.21789529116128, 193.25843228678508]
=== 32767 values

You will note that testing against a list constant is just as fast as
testing against a set.  This was surprising for me; apparently the
__contains__ operator turns a tuple into a set.  You will also note
that performance falls off drastically for the last set of values.
I'm not sure what happens there; I guess I'll file a bug report.
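
A minimal sketch of how the same kind of comparison could be repeated
on whatever interpreter is at hand. It is not the script used above:
the names CODES_LIST, CODES_TUPLE, CODES_SET and time_membership are
made up for illustration, and only the probe value 88632719 and the
three codes come from this thread. Absolute numbers will differ by
machine and Python version.

```python
from timeit import timeit

# Hypothetical names; the three codes are the ones from the original question.
CODES_LIST = [25401, 25402, 25408]
CODES_TUPLE = (25401, 25402, 25408)
CODES_SET = frozenset(CODES_TUPLE)


def time_membership(container, probe, number=100000):
    """Return the seconds taken by `number` tests of `probe in container`."""
    return timeit(lambda: probe in container, number=number)


if __name__ == "__main__":
    # 88632719 is in none of the containers, so every test is a miss,
    # which is the worst case for the list and the tuple.
    for name, container in (("list", CODES_LIST),
                            ("tuple", CODES_TUPLE),
                            ("frozenset", CODES_SET)):
        print(name, time_membership(container, 88632719))
```

With only three items the differences are likely to be lost in the
noise; the gap should only become visible as the container grows.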


Re: preferring [] or () in list of error codes?

2009-06-09 Thread samwyse
On Jun 8, 10:06 pm, Chris Rebert  wrote:
> On Mon, Jun 8, 2009 at 6:57 PM, samwyse wrote:
> > On Jun 8, 7:37 pm, Carl Banks  wrote:
> >> On Jun 8, 4:43 pm, Ben Finney  wrote:
> >> > m...@pixar.com writes:
> >> > > Is there any reason to prefer one or the other of these statements?
>
> >> > >         if e.message.code in [25401,25402,25408]:
> >> > >         if e.message.code in (25401,25402,25408):
>
> >> If you want to go strictly by the book, I would say he ought to be
> >> using a set since his collection of numbers has no meaningful order
> >> nor does it make sense to list any item twice.
>
> > As the length of the list increases, the increased speed of looking
> > something up makes using a set make more sense.  But what's the best
> > way to express this?  Here are a few more comparisons (using Python
> > 3.0)...
>
> > >>> S=lambda x:x in set((25401,25402,25408))
> > >>> dis(S)
> >  1           0 LOAD_FAST                0 (x)
> >              3 LOAD_GLOBAL              0 (set)
> >              6 LOAD_CONST               3 ((25401, 25402, 25408))
> >              9 CALL_FUNCTION            1
> >             12 COMPARE_OP               6 (in)
> >             15 RETURN_VALUE
> > >>> S=lambda x:x in{25401,25402,25408}
> > >>> dis(S)
> >  1           0 LOAD_FAST                0 (x)
> >              3 LOAD_CONST               0 (25401)
> >              6 LOAD_CONST               1 (25402)
> >              9 LOAD_CONST               2 (25408)
> >             12 BUILD_SET                3
> >             15 COMPARE_OP               6 (in)
> >             18 RETURN_VALUE
> > >>> S=lambda x:x in{(25401,25402,25408)}
> > >>> dis(S)
> >  1           0 LOAD_FAST                0 (x)
> >              3 LOAD_CONST               3 ((25401, 25402, 25408))
> >              6 BUILD_SET                1
> >              9 COMPARE_OP               6 (in)
> >             12 RETURN_VALUE
>
> > I conclude that using constructors is generally a bad idea, since the
> > compiler doesn't know if you're calling the builtin or something with
> > an overloaded name.  I presume that the compiler will eventually
> > optimize the second example to match the last, but both of them use
> > the BUILD_SET opcode.  I expect that this can be expensive for long
>
> Erm, unless I misunderstand you somehow, the second example will and
> should *never* match the last.
> The set {25401,25402,25408}, containing 3 integer elements, is quite
> distinct from the set {(25401,25402,25408)}, containing one element
> and that element is a tuple.
> set(X) != {X}; set([X]) == {X}

D'oh!  I was thinking about how you can initialize a set from an
iterator and for some reason thought that you could do the same with a
set constant.
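
The distinction Chris points out can be checked directly. This is a
small sketch; the variable names are made up for illustration, and the
last line assumes Python 3.5 or later, where unpacking inside a set
display is allowed.

```python
codes = (25401, 25402, 25408)

# The constructor iterates over its argument: three int elements.
from_constructor = set(codes)

# A set display lists its elements directly: also three int elements.
from_display = {25401, 25402, 25408}

# Wrapping the tuple in braces gives one element, the tuple itself.
wrapped = {codes}

print(from_constructor == from_display)   # True
print(len(wrapped))                       # 1
print(set([codes]) == wrapped)            # True
print({*codes} == from_display)           # True (Python 3.5+ unpacking)
```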


Re: preferring [] or () in list of error codes?

2009-06-09 Thread Steven D'Aprano
On Tue, 09 Jun 2009 09:43:45 +1000, Ben Finney wrote:

> Use a list when the semantic meaning of an item doesn't depend on all
> the other items: it's “only” a collection of values.
> 
> Your list of message codes is a good example: if a value appears at
> index 3, that doesn't make it mean something different from the same
> value appearing at index 2.

That advice would seem to imply that lists shouldn't be ordered. If a 
list of values has an order, it implies that "first place" (index 0) is 
different from "second place", by virtue of the positions they appear in 
the list. The lists:

presidential_candidates_sorted_by_votes = ['Obama', 'McCain']
presidential_candidates_sorted_by_votes = ['McCain', 'Obama']

have very different meanings. Prohibiting the use of lists in the context 
of ordered data is surely an unfortunate consequence of your advice.


> James Tauber explains this at
> <http://jtauber.com/blog/2006/04/15/python_tuples_are_not_just_constant_lists/>.


He doesn't really explain anything though, he merely states it as 
revealed wisdom. The closest he comes to an explanation is to declare 
that "the index in a tuple has an implied semantic. The point 
of a tuple is that the i-th slot means something specific. In other 
words, it's a index-based (rather than name based) datastructure." But he 
gives no reason for why we should accept that as true for tuples but not 
lists.

It may be that that's precisely the motivation Guido had when he 
introduced tuples into Python, but why should we not overload tuples with 
more meanings than Guido (hypothetically) imagined? In other words, why 
*shouldn't* we treat tuples as immutable lists, if that helps us solve a 
problem effectively?

To put it another way, I think the question of whether or not tuples are 
immutable lists has the answer Mu. Sometimes they are, sometimes they're 
not. I have no problem with the title of the quoted blog post -- that 
tuples are not *just* constant lists -- but I do dispute that there is 
any reason for declaring that tuples must not be used as constant lists. 
As tuples are defined in Python, they quack like immutable lists, they 
walk like immutable lists, and they swim like immutable lists. Why 
shouldn't we treat them as immutable lists?

Phillip Eby states that "Lists are intended to be homogeneous sequences, 
while tuples are heterogeneous data structures." (Notice the subtle shift 
there: lists are "intended", while tuples "are". But in fact, there's 
nothing to stop you from putting homogeneous data into a tuple, so Eby is 
wrong to say that tuples *are* heterogeneous.)

Perhaps Eby intends lists to be homogeneous, perhaps Guido does too, but 
this is Python, where we vigorously defend the right to shoot ourselves 
in the foot. We strongly discourage class creators from trying to enforce 
their intentions by using private attributes, and even when we allow such 
a thing, the nature of Python is that nothing is truly private. Why 
should homogeneity and heterogeneity of lists and tuples be sacrosanct? 
Nothing stops me from putting heterogeneous data into a list, or 
homogeneous data into a tuple, and there don't appear to be any ill 
effects from doing so. Why lose sleep over the alleged lack of 
purity?
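
A short sketch of what "quacks like an immutable list" looks like in
practice. The helper names supports_append and usable_as_dict_key are
made up for illustration; only the three codes come from this thread.

```python
codes_list = [25401, 25402, 25408]
codes_tuple = tuple(codes_list)  # same items, immutable container


def supports_append(seq):
    """Report whether the sequence offers list-style in-place append."""
    return hasattr(seq, "append")


def usable_as_dict_key(obj):
    """Report whether obj is hashable, so it can be a dict key or set member."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


if __name__ == "__main__":
    # Indexing, iteration, and membership read the same for both types.
    print(codes_tuple[0] == codes_list[0])
    print(list(codes_tuple) == codes_list)
    print(25402 in codes_tuple, 25402 in codes_list)
    # The practical differences: in-place mutation and hashability.
    print(supports_append(codes_list), supports_append(codes_tuple))
    print(usable_as_dict_key(codes_list), usable_as_dict_key(codes_tuple))
    # Nothing stops mixed types in a list or uniform types in a tuple.
    mixed_list = [25401, "timeout", 3.5]
    print(len(mixed_list))
```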



-- 
Steven


Re: preferring [] or () in list of error codes?

2009-06-08 Thread Emile van Sebille

On 6/8/2009 8:43 PM Ben Finney said...
> Steven D'Aprano  writes:
>
>> In addition, using a tuple or a list in this context:
>>
>> if e.message.code in (25401,25402,25408):
>>
>> is so idiomatic, that using a set in its place would be distracting.
>
> I think a list in that context is fine, and that's the idiom I see far
> more often than a tuple.
>
>> Rather than efficiently communicating the programmer's intention, it
>> would raise in my mind the question "that's strange, why are they
>> using a set there instead of a tuple?".
>
> The fact that literal set syntax is a relative newcomer is the primary
> reason for that, I'd wager.

Well, no.  It really is more, "that's odd... why use set?"

Emile



Re: preferring [] or () in list of error codes?

2009-06-08 Thread Ben Finney
Steven D'Aprano  writes:

> In addition, using a tuple or a list in this context:
> 
> if e.message.code in (25401,25402,25408):
> 
> is so idiomatic, that using a set in its place would be distracting.

I think a list in that context is fine, and that's the idiom I see far
more often than a tuple.

> Rather than efficiently communicating the programmer's intention, it
> would raise in my mind the question "that's strange, why are they
> using a set there instead of a tuple?".

The fact that literal set syntax is a relative newcomer is the primary
reason for that, I'd wager.

-- 
 \   “If you are unable to leave your room, expose yourself in the |
  `\window.” —instructions in case of fire, hotel, Finland |
_o__)  |
Ben Finney


Re: preferring [] or () in list of error codes?

2009-06-08 Thread Steven D'Aprano
On Tue, 09 Jun 2009 11:02:54 +1000, Ben Finney wrote:

> Carl Banks  writes:
> 
>> If you want to go strictly by the book, I would say he ought to be
>> using a set since his collection of numbers has no meaningful order nor
>> does it make sense to list any item twice.
> 
> Yes, a set would be best for this specific situation.
> 
>> I don't think it's very important, however, to stick to rules like that
>> for objects that don't live for more than a single line of code.
> 
> It's important to the extent that it's important to express one's
> *meaning*. Program code should be written primarily as a means of
> communicating with other programmers, and only incidentally for the
> computer to execute.


But practicality beats purity -- there are many scenarios where we make 
compromises in our meaning in order to get correct, efficient code. E.g. 
we use floats, despite them being a poor substitute for the abstract Real 
numbers we mean.

In addition, using a tuple or a list in this context:

if e.message.code in (25401,25402,25408):

is so idiomatic, that using a set in its place would be distracting. 
Rather than efficiently communicating the programmer's intention, it 
would raise in my mind the question "that's strange, why are they using a 
set there instead of a tuple?".


-- 
Steven


Re: preferring [] or () in list of error codes?

2009-06-08 Thread Chris Rebert
On Mon, Jun 8, 2009 at 6:57 PM, samwyse wrote:
> On Jun 8, 7:37 pm, Carl Banks  wrote:
>> On Jun 8, 4:43 pm, Ben Finney  wrote:
>> > m...@pixar.com writes:
>> > > Is there any reason to prefer one or the other of these statements?
>>
>> > >         if e.message.code in [25401,25402,25408]:
>> > >         if e.message.code in (25401,25402,25408):
>>
>> If you want to go strictly by the book, I would say he ought to be
>> using a set since his collection of numbers has no meaningful order
>> nor does it make sense to list any item twice.
>
> As the length of the list increases, the increased speed of looking
> something up makes using a set make more sense.  But what's the best
> way to express this?  Here are a few more comparisons (using Python
> 3.0)...
>
> >>> S=lambda x:x in set((25401,25402,25408))
> >>> dis(S)
>  1           0 LOAD_FAST                0 (x)
>              3 LOAD_GLOBAL              0 (set)
>              6 LOAD_CONST               3 ((25401, 25402, 25408))
>              9 CALL_FUNCTION            1
>             12 COMPARE_OP               6 (in)
>             15 RETURN_VALUE
> >>> S=lambda x:x in{25401,25402,25408}
> >>> dis(S)
>  1           0 LOAD_FAST                0 (x)
>              3 LOAD_CONST               0 (25401)
>              6 LOAD_CONST               1 (25402)
>              9 LOAD_CONST               2 (25408)
>             12 BUILD_SET                3
>             15 COMPARE_OP               6 (in)
>             18 RETURN_VALUE
> >>> S=lambda x:x in{(25401,25402,25408)}
> >>> dis(S)
>  1           0 LOAD_FAST                0 (x)
>              3 LOAD_CONST               3 ((25401, 25402, 25408))
>              6 BUILD_SET                1
>              9 COMPARE_OP               6 (in)
>             12 RETURN_VALUE
>
> I conclude that using constructors is generally a bad idea, since the
> compiler doesn't know if you're calling the builtin or something with
> an overloaded name.  I presume that the compiler will eventually
> optimize the second example to match the last, but both of them use
> the BUILD_SET opcode.  I expect that this can be expensive for long

Erm, unless I misunderstand you somehow, the second example will and
should *never* match the last.
The set {25401,25402,25408}, containing 3 integer elements, is quite
distinct from the set {(25401,25402,25408)}, containing one element
and that element is a tuple.
set(X) != {X}; set([X]) == {X}

Cheers,
Chris
-- 
http://blog.rebertia.com


Re: preferring [] or () in list of error codes?

2009-06-08 Thread Charles Yeomans


On Jun 8, 2009, at 9:28 PM, Carl Banks wrote:

> On Jun 8, 6:02 pm, Ben Finney  wrote:
>> Carl Banks  writes:
>>> If you want to go strictly by the book, I would say he ought to be
>>> using a set since his collection of numbers has no meaningful order
>>> nor does it make sense to list any item twice.
>>
>> Yes, a set would be best for this specific situation.
>>
>>> I don't think it's very important, however, to stick to rules like
>>> that for objects that don't live for more than a single line of
>>> code.
>>
>> It's important to the extent that it's important to express one's
>> *meaning*. Program code should be written primarily as a means of
>> communicating with other programmers, and only incidentally for the
>> computer to execute.
>
> Which is precisely why it's not very important for an object that
> exists for one line.  No programmer is ever going to be confused about
> the meaning of this:
>
> if a in (1,2,3):



Actually, I might be -- I think of a tuple first as a single thing, as  
opposed to a list or map, which I see first as a collection of other  
things.


Charles Yeomans


Re: preferring [] or () in list of error codes?

2009-06-08 Thread samwyse
On Jun 8, 7:37 pm, Carl Banks  wrote:
> On Jun 8, 4:43 pm, Ben Finney  wrote:
> > m...@pixar.com writes:
> > > Is there any reason to prefer one or the other of these statements?
>
> > >         if e.message.code in [25401,25402,25408]:
> > >         if e.message.code in (25401,25402,25408):
>
> If you want to go strictly by the book, I would say he ought to be
> using a set since his collection of numbers has no meaningful order
> nor does it make sense to list any item twice.

As the length of the list increases, the increased speed of looking
something up makes using a set make more sense.  But what's the best
way to express this?  Here are a few more comparisons (using Python
3.0)...

>>> S=lambda x:x in set((25401,25402,25408))
>>> dis(S)
  1           0 LOAD_FAST                0 (x)
              3 LOAD_GLOBAL              0 (set)
              6 LOAD_CONST               3 ((25401, 25402, 25408))
              9 CALL_FUNCTION            1
             12 COMPARE_OP               6 (in)
             15 RETURN_VALUE
>>> S=lambda x:x in{25401,25402,25408}
>>> dis(S)
  1           0 LOAD_FAST                0 (x)
              3 LOAD_CONST               0 (25401)
              6 LOAD_CONST               1 (25402)
              9 LOAD_CONST               2 (25408)
             12 BUILD_SET                3
             15 COMPARE_OP               6 (in)
             18 RETURN_VALUE
>>> S=lambda x:x in{(25401,25402,25408)}
>>> dis(S)
  1           0 LOAD_FAST                0 (x)
              3 LOAD_CONST               3 ((25401, 25402, 25408))
              6 BUILD_SET                1
              9 COMPARE_OP               6 (in)
             12 RETURN_VALUE

I conclude that using constructors is generally a bad idea, since the
compiler doesn't know if you're calling the builtin or something with
an overloaded name.  I presume that the compiler will eventually
optimize the second example to match the last, but both of them use
the BUILD_SET opcode.  I expect that this can be expensive for long
lists, so I don't think that it's a good idea to use set constants
inside loops.  Instead it should be assigned to a global or class
variable.
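
One way the "assign it to a global" suggestion might look, as a sketch
only. The names KNOWN_CODES, is_known_code and count_known are
hypothetical; the thread does not say what the three codes mean. A
frozenset is used here as an assumption, to make clear that the
collection is never meant to change. Later CPython versions may fold a
constant set display used directly with "in" into a constant of their
own, in which case the difference disappears; dis will show which
applies on a given interpreter.

```python
# Built once, when the module is imported, rather than on every test.
KNOWN_CODES = frozenset((25401, 25402, 25408))


def is_known_code(code):
    """Return True if code is one of the codes listed at module level."""
    return code in KNOWN_CODES


def count_known(codes):
    """Count how many items of an iterable of codes are in KNOWN_CODES."""
    return sum(1 for code in codes if is_known_code(code))


if __name__ == "__main__":
    print(is_known_code(25402))
    print(count_known([25401, 1, 25408, 88632719]))
```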
-- 

Re: preferring [] or () in list of error codes?

2009-06-08 Thread Carl Banks
On Jun 8, 6:02 pm, Ben Finney  wrote:
> Carl Banks  writes:
> > If you want to go strictly by the book, I would say he ought to be
> > using a set since his collection of numbers has no meaningful order
> > nor does it make sense to list any item twice.
>
> Yes, a set would be best for this specific situation.
>
> > I don't think it's very important, however, to stick to rules like
> > that for objects that don't live for more than a single line of code.
>
> It's important to the extent that it's important to express one's
> *meaning*. Program code should be written primarily as a means of
> communicating with other programmers, and only incidentally for the
> computer to execute.

Which is precisely why it's not very important for an object that
exists for one line.  No programmer is ever going to be confused about
the meaning of this:

if a in (1,2,3):


Carl Banks


Re: preferring [] or () in list of error codes?

2009-06-08 Thread Ben Finney
Carl Banks  writes:

> If you want to go strictly by the book, I would say he ought to be
> using a set since his collection of numbers has no meaningful order
> nor does it make sense to list any item twice.

Yes, a set would be best for this specific situation.

> I don't think it's very important, however, to stick to rules like
> that for objects that don't live for more than a single line of code.

It's important to the extent that it's important to express one's
*meaning*. Program code should be written primarily as a means of
communicating with other programmers, and only incidentally for the
computer to execute.

-- 
 \“Laurie got offended that I used the word ‘puke’. But to me, |
  `\ that's what her dinner tasted like.” —Jack Handey |
_o__)  |
Ben Finney


Re: preferring [] or () in list of error codes?

2009-06-08 Thread Carl Banks
On Jun 8, 4:43 pm, Ben Finney  wrote:
> m...@pixar.com writes:
> > Is there any reason to prefer one or the other of these statements?
>
> >         if e.message.code in [25401,25402,25408]:
> >         if e.message.code in (25401,25402,25408):
>
> > I'm currently using [], but only coz I think it's prettier
> > than ().
>
> Use a list when the semantic meaning of an item doesn't depend on all
> the other items: it's “only” a collection of values.
>
> Your list of message codes is a good example: if a value appears at
> index 3, that doesn't make it mean something different from the same
> value appearing at index 2.
>
> Use a tuple when the semantic meaning of the items are bound together,
> and it makes more sense to speak of all the items as a single structured
> value.

If you want to go strictly by the book, I would say he ought to be
using a set since his collection of numbers has no meaningful order
nor does it make sense to list any item twice.

I don't think it's very important, however, to stick to rules like
that for objects that don't live for more than a single line of code.


Carl Banks


Re: preferring [] or () in list of error codes?

2009-06-08 Thread Ben Finney
m...@pixar.com writes:

> Is there any reason to prefer one or the other of these statements?
> 
> if e.message.code in [25401,25402,25408]:
> if e.message.code in (25401,25402,25408):
> 
> I'm currently using [], but only coz I think it's prettier
> than ().

Use a list when the semantic meaning of an item doesn't depend on all
the other items: it's “only” a collection of values.

Your list of message codes is a good example: if a value appears at
index 3, that doesn't make it mean something different from the same
value appearing at index 2.


Use a tuple when the semantic meaning of the items are bound together,
and it makes more sense to speak of all the items as a single structured
value.

The classic examples are point coordinates and timestamps: rather than a
collection of values, it makes more sense to think of each coordinate
set or timestamp as a single complex value. The value 7 appearing at
index 2 would have a completely different meaning from the value 7
appearing at index 3.
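
The distinction can be made concrete with a small sketch. The names
error_codes, point and describe_point are illustrative only, not taken
from any library.

```python
# A plain collection: an item means the same thing wherever it sits,
# so reordering the list loses nothing.
error_codes = [25401, 25402, 25408]

# A single structured value: each slot has its own meaning (x, y, z),
# so reordering the items produces a different point.
point = (1, 0, 0)


def describe_point(p):
    """Unpack a 3-item point by position and label each coordinate."""
    x, y, z = p
    return "x=%d y=%d z=%d" % (x, y, z)


if __name__ == "__main__":
    shuffled = sorted(error_codes, reverse=True)
    print(25402 in error_codes, 25402 in shuffled)
    print(describe_point(point))
    print(describe_point(point[::-1]))
```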


James Tauber explains this at
<http://jtauber.com/blog/2006/04/15/python_tuples_are_not_just_constant_lists/>.

-- 
 \   “Pinky, are you pondering what I'm pondering?” “Well, I think |
  `\  so, Brain, but pantyhose are so uncomfortable in the |
_o__)  summertime.” —_Pinky and The Brain_ |
Ben Finney


Re: preferring [] or () in list of error codes?

2009-06-08 Thread John Machin
m...@pixar.com writes:

> 
> Is there any reason to prefer one or the other of these statements?
> 
> if e.message.code in [25401,25402,25408]:
> if e.message.code in (25401,25402,25408):
> 

From the viewpoint of relative execution speed, in the above case
if it matters at all it matters only on Python 2.4 AFAICT:

| >>> L=lambda x:x in[25401,25402,25408];
T=lambda x:x in(25401,25402,25408);import dis;dis.dis(L);dis.dis(T)
  1           0 LOAD_FAST                0 (x)
              3 LOAD_CONST               1 (25401)
              6 LOAD_CONST               2 (25402)
              9 LOAD_CONST               3 (25408)
             12 BUILD_LIST               3
             15 COMPARE_OP               6 (in)
             18 RETURN_VALUE
  1           0 LOAD_FAST                0 (x)
              3 LOAD_CONST               4 ((25401, 25402, 25408))
              6 COMPARE_OP               6 (in)
              9 RETURN_VALUE

Earlier versions build the list or tuple at run time
(as for the list above); later versions detect that
the list can't be mutated and generate the same code
for both the list and tuple.

However there are limits to the analysis that can be
performed e.g. if the list is passed to a function,
pursuit halts at the county line:

[Python 2.6.2]
| >>> F=lambda y,z:y in z;L=lambda x:F(x,[25401,25402,25408]);
T=lambda x:F(x,(25401,25402,25408));import dis;dis.dis(L);dis.dis(T)
  1           0 LOAD_GLOBAL              0 (F)
              3 LOAD_FAST                0 (x)
              6 LOAD_CONST               0 (25401)
              9 LOAD_CONST               1 (25402)
             12 LOAD_CONST               2 (25408)
             15 BUILD_LIST               3
             18 CALL_FUNCTION            2
             21 RETURN_VALUE
  1           0 LOAD_GLOBAL              0 (F)
              3 LOAD_FAST                0 (x)
              6 LOAD_CONST               3 ((25401, 25402, 25408))
              9 CALL_FUNCTION            2
             12 RETURN_VALUE

So in general anywhere I had a "list constant" I'd make
it a tuple -- I'm not aware of any way that performance
gets worse by doing that, and it can get better.
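
A sketch for checking what the interpreter at hand actually does,
since the bytecode differs between versions. It assumes Python 3.4 or
later for dis.get_instructions; the names in_list, in_tuple and
opnames are made up for illustration, and no particular output is
claimed here.

```python
import dis

in_list = lambda x: x in [25401, 25402, 25408]
in_tuple = lambda x: x in (25401, 25402, 25408)


def opnames(func):
    """Return the opcode names the running interpreter compiled for func."""
    return [ins.opname for ins in dis.get_instructions(func)]


if __name__ == "__main__":
    print(opnames(in_list))
    print(opnames(in_tuple))
    # Identical opcode sequences suggest the compiler treated the list
    # display like the tuple; a BUILD_LIST entry means it did not.
    print(opnames(in_list) == opnames(in_tuple))
```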

Background: I'm supporting packages that run on 2.1 to 2.6
in one case and 2.4 to 2.6 in the other; every little
unobtrusive tweak helps :-)

HTH,
John



Re: preferring [] or () in list of error codes?

2009-06-08 Thread Scott David Daniels

m...@pixar.com wrote:
> Is there any reason to prefer one or the other of these statements?
>
> if e.message.code in [25401,25402,25408]:
> if e.message.code in (25401,25402,25408):
>
> I'm currently using [], but only coz I think it's prettier
> than ().
>
> context: these are database errors and e is a database exception,
> so there's probably been zillions of instructions and io's
> handling that already.


I lightly prefer the (a, b, c) -- you do put spaces after the comma,
don't you?  A tuple can be kept as a constant, but it requires (not
very heavy) program analysis to determine that the list need not be
constructed each time the statement is executed.  In addition, a
tuple is allocated as a single block, while a list is a pair of
allocations.

The cost is tiny, however, and your sense of aesthetics is part of
your code.  So unless you only very slightly prefer brackets, if I
were you I'd go with the list form.

--Scott David Daniels
scott.dani...@acm.org


Re: preferring [] or () in list of error codes?

2009-06-08 Thread Stephen Hansen
On Mon, Jun 8, 2009 at 2:36 PM,  wrote:

> Is there any reason to prefer one or the other of these statements?
>
>if e.message.code in [25401,25402,25408]:
>if e.message.code in (25401,25402,25408):
>
> I'm currently using [], but only coz I think it's prettier
> than ().


I like to use tuples / () if the sequence literal is ultimately static.
Purely because in my mind that just makes it a little more clear-- a list is
mutable, so I use it when it should be or may be mutated; if it never would,
I use a tuple. It just seems clearer to me that way.

But a tuple also takes up a little less space in memory, so it's a bit more
efficient that way. I have absolutely no idea if reading / checking for
contents in a list vs tuple has any performance difference, but would
suspect it'd be tiny (and probably irrelevant in a small case like that),
but still.
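
The memory claim can be checked on CPython with sys.getsizeof, as in
this sketch. The helper name shallow_size is made up for illustration,
and the sizes are an implementation detail of CPython: they vary by
version and platform, and other implementations may not support
getsizeof at all.

```python
import sys

codes_list = [25401, 25402, 25408]
codes_tuple = (25401, 25402, 25408)


def shallow_size(obj):
    """Size in bytes of the container itself, not of the ints it refers to."""
    return sys.getsizeof(obj)


if __name__ == "__main__":
    print("list :", shallow_size(codes_list))
    print("tuple:", shallow_size(codes_tuple))
```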

--S


preferring [] or () in list of error codes?

2009-06-08 Thread mh
Is there any reason to prefer one or the other of these statements?

if e.message.code in [25401,25402,25408]:
if e.message.code in (25401,25402,25408):

I'm currently using [], but only coz I think it's prettier
than ().

context: these are database errors and e is a database exception,
so there's probably been zillions of instructions and io's
handling that already.

Many TIA!
Mark

-- 
Mark Harrison
Pixar Animation Studios