On 11/06/2012 05:55 PM, Steven D'Aprano wrote:
On Tue, 06 Nov 2012 14:41:24 -0800, Andrew Robinson wrote:

Yes.  But this isn't going to cost any more time than figuring out
whether or not the list multiplication is going to cause quirks, itself.
  Human psychology *tends* (it's a FAQ!) to automatically assume the
purpose of the list multiplication is to pre-allocate memory for the
equivalent (using lists) of a multi-dimensional array.  Note the OP even
said "4d array".
I'm not entirely sure what your point is here. The OP screwed up -- he
didn't generate a 4-dimensional array. He generated a 2-dimensional
array. If his intuition about the number of dimensions is so poor, why
should his intuition about list multiplication be treated as sacrosanct?
Yes he did screw up.
There is a great deal of value in studying how people screw up, and designing interfaces which tend to discourage it. "Candy machine interfaces".

As they say, the only truly intuitive interface is the nipple.
No it's not -- that interface really sucks.  :)
Have you ever seen a cat trying to suck a human nipple -- ?
Or, have you ever asked a young child who was weaned early and doesn't remember nursing -- what a breast is for ? Once the oral stage is left, remaining behavior must be re-learned.
  There are
many places where people's intuition about programming fails. And many
places where Fred's intuition is the opposite of Barney's intuition.
OK. But that doesn't mean that *all* places have opposite intuitions; nor does it mean that an intuition which is statistically *always* wrong shouldn't be discouraged, or re-routed into useful behavior.

Take the candy machine. If the items being sold are listed by number, and the prices are also numbers, it's very easy to type in the price instead of the item number, because one *forgets* that the numbers have different meanings. The machine can't always tell from the price which item a person wanted (duplicate prices...). Hence a common mistake: people get the wrong item by typing in the price.

By merely avoiding a numeric keypad, the machine re-routes the user into choosing the correct item, because the mistake can no longer be made.

For this reason, Python tends to *like* things such as named parameters, and occasionally enforces their use.
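
A minimal sketch of how named parameters can be enforced, assuming Python 3 keyword-only arguments. The function `vend` and its parameter names are hypothetical, invented only to mirror the candy-machine example:

```python
# Hypothetical candy-machine API: parameters after the bare `*` are keyword-only.
def vend(*, item_number, price_cents):
    return "item %d for %d cents" % (item_number, price_cents)

# The call site must name both numbers, so they cannot be silently swapped.
receipt = vend(item_number=7, price_cents=125)

# Passing two bare positional numbers is rejected with a TypeError.
try:
    vend(7, 125)
    positional_rejected = False
except TypeError:
    positional_rejected = True
```

Like the keypad-free candy machine, the interface removes the opportunity to confuse one number for the other.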

Even more exciting, there are places where people's intuition is
*inconsistent*, where they expect a line of code to behave differently
depending on their intention, rather than on the code. And intuition is
often sub-optimal: e.g. isn't it intuitively obvious that "42" + 1 should
give 43? (Unless it is intuitively obvious that it should give 421.)
I agree, and in places where an *exception* can be raised, it's appropriate to do so.
Ambiguity, like the candy machine, is *bad*.
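
For what it's worth, this is how Python already treats the "42" + 1 case. A short sketch (the variable names are only for illustration):

```python
# Mixing str and int with + is ambiguous, so Python raises instead of guessing.
try:
    result = "42" + 1
except TypeError:
    result = None

# The programmer has to say which meaning is intended.
as_number = int("42") + 1   # arithmetic
as_text = "42" + str(1)     # concatenation
```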

So while I prefer intuitively obvious behaviour where possible, it is not
the holy grail, and I am quite happy to give it up.
"where possible"; OK, fine -- I agree. I'm not "happy" to give it up; but I am willing. I don't like the man hours wasted on ambiguous behavior; and I don't ever think that should make someone "happy".

The OP's original construction was simple, elegant, easy to read and
very commonly done by newbies learning the language because it's
*intuitive*.  His second try was still intuitive, but less easy to read,
and not as elegant.
Yes. And list multiplication is one of those areas where intuition is
suboptimal -- it produces a worse outcome overall, even if one minor use-
case gets a better outcome.

I'm not disputing that [[0]*n]*m is intuitively obvious and easy. I'm
disputing that this matters. Python would be worse off if list
multiplication behaved intuitively.
How would it be worse off?

I can agree, for example, that in "C" -- realloc -- is too general.
One can't look at the line where realloc is being used, and decide if it is:
1) mallocing
2) deleting
3) resizing

Number (3) is the only non-redundant behavior the function provides.
There is, perhaps, a very clear reason that I haven't discovered why the extra functionality in list multiplication would be bad. That reason is *not* that list multiplication is unable to solve all the copying problems in the world (realloc is bad precisely because of that). But a function ought to do at least *one* thing well.

Draw up some use cases for the multiplication operator (I'm calling on your experience, let's not trust mine, right?). What are all the typical ways people *do* use it now?

If those use cases do not *primarily* center around *wanting* an effect explicitly caused by reference duplication -- then it may be better to abolish list multiplication altogether, and instead improve list comprehensions to overcome the memory, clarity, and speed pitfalls in the most common case of initializing a list.

For example, in initialization use cases, the loop variable of a for loop often isn't needed, and all the initializers have parameters which only need to be evaluated *once* (no side effects).

Hence, there is an opportunity for speed and memory gains, while maintaining clarity and *consistency*.

Some ideas of use cases:
[ (0) in xrange(10) ]  # The function to create a tuple caches the parameter '0', makes 10 (0)'s
[ dict.__new__(dict) in xrange(10) ]  # dict.__new__: the dict parameter is cached -- makes 10 dicts.
[ lambda x:(0) in xrange(10) ]  # lambda caches (0), returns a *reference* to it multiple times.


An analogy: the intuitively obvious thing to do with a screw is to bang
it in with a hammer. It's long, thin, has a point at the end, and a flat
head that just screams "hit me". But if you do the intuitive thing, your
carpentry will be *much worse* than the alternatives.
:)
I agree.  Good point and Good "thin point".

Having list multiplication copy has consequences beyond 2D arrays. Those
consequences make the intuitive behaviour you are requesting a negative
rather than a positive. If that means that newbie programmers have to
learn not to hammer screws in, so be it. It might be harder, slower, and
less elegant to drill a pilot hole and then screw the screw in, but the
overall result is better.
No, the overall result is still bad. If the answer is *don't* hammer screws, then it's better to raise an exception when it's tried. There's no way to do that with list multiplication.

* Consistency of semantics is better than a plethora of special
    cases. Python has a very simple and useful rule: objects should not
    be copied unless explicitly requested to be copied. This is much
    better than having to remember whether this operation or that
    operation makes a copy. The answer is consistent:
Bull.  Even in the last thread I noted the range() object produces
special cases.
  >>>  range(0,5)[1]
1
  >>>  range(0,5)[1:3]
range(1, 3)
What's the special case here? What do you think is copied?


You take a slice of a range object, you get a new range object.
You weren't paying attention: OCCASIONALLY you get an integer, or a list.
>>> range(3)[2]
2

LOOOOK! That's not a range object, that's an integer. Use Python 3.2 and try it.
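
For anyone following along, a small sketch of the two behaviours under discussion, assuming Python 3 (the names `r`, `indexed`, `sliced` are just for illustration):

```python
r = range(0, 5)

indexed = r[1]      # indexing a range yields one element
sliced = r[1:3]     # slicing a range yields another range object

# The result types differ depending on whether an index or a slice is used.
indexed_is_int = isinstance(indexed, int)
sliced_is_range = isinstance(sliced, range)
```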

I'm honestly not getting what you think is inconsistent about this.
How about now?

Two-dimensional arrays in Python using lists are quite rare. Anyone who is doing serious numeric work where they need 2D arrays is using numpy, not lists.
Game programmers routinely use 2D lists to represent the screen layout;
For example, they might use 'b' to represent a brick tile, and 'w' to represent a water tile. This is quite common in simple games; I have seen several use 2D lists (or tuples) to do this. Serious numeric work is not needed in most simple games; especially if motion is not involved.
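
A rough sketch of such a tile map, and of how the pitfall from this thread shows up in it. The 3x4 size and the names `grid` and `shared` are made up for illustration:

```python
# Hypothetical 3x4 tile map: 'b' = brick, 'w' = water.
ROWS, COLS = 3, 4

# A comprehension builds a fresh inner list for every row.
grid = [['b'] * COLS for _ in range(ROWS)]
grid[1][2] = 'w'    # floods a single tile

# Multiplying the outer list instead repeats a reference to one row object.
shared = [['b'] * COLS] * ROWS
shared[1][2] = 'w'  # the change is visible through every row
```

In `grid` only one tile becomes water; in `shared` the whole column does, because all three rows are the same list.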

There are *many* non-serious uses of matrix mathematics and 2D lists. Numpy isn't desired even if it would work. Cost-benefit analysis....

Crossword puzzles, a periodic table of the elements with different sources of weights listed under each element, etc. (That can also be done with a dict, but I've seen an implementation do it the other way.)

There are millions of people using Python, so it's hardly surprising that once or twice a year some newbie trips over this. But it's not something that people tend to trip over again and again and again, like C's "assignment is an expression" misfeature.
Good point. I don't have a statistic -- except the handful of times I searched for some other topic -- and I have seen it three times already....

I read some of the documentation on why Python 3 chose to implement it
this way.
What documentation is this?
The documentation for range() -- which I just read because of another thread we were both in. You're misconstruing the subject -- which was whether "inconsistency" in Python is allowed; not "is list multiplication inconsistent."
      Q: What about [[]]*10?
      A: No, the elements are never copied.

YES! For the obvious reason that such a construction is making mutable
lists that the user wants to populate later.  If they *didn't* want to
populate them later, they ought to have used tuples -- which take less
overhead.  Who even does this thing you are suggesting?!
Who knows? Who cares? Nobody does:
exactly !!!! But I do care, even though I don't do it (because it doesn't *work*)


n -= n

instead of just n=0, but that doesn't mean that we should give it some
sort of special meaning different from n -= m. If it turns out that the
definition of list multiplication is such that NOBODY, EVER, uses [[]]*n,
that is *still* not a good reason for special-casing it.
Ahh... but some people *DO* try to use it for another purpose.
Your example is a bad analogy.


Special cases aren't special enough to break the rules.
finish the sentence: ALTHOUGH practicality beats purity.

There are perfectly good ways to generate a 2D array out of lists, and
even better reasons not to use lists for that in the first place. (Numpy
arrays are much better suited for serious work.)
Duh... I answered that....


I'm afraid you've just lost an awful lot of credibility there.

py>  x = [{}]*5
py>  x
[{}, {}, {}, {}, {}]
No, I showed what happened when you do {}*3;
That *DOESN'T* work; You aren't multiplying the dictionary, you are multiplying the LIST of dictionaries. Very different things. You were complaining that my method doesn't multiply them -- well, gee -- either mine DOES or python DOESN'T. Double standards are *crap*.
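
To make the distinction concrete, a minimal sketch (variable names are illustrative only):

```python
# Multiplying a dict itself is not defined and raises TypeError.
try:
    {} * 3
    dict_times_int_ok = True
except TypeError:
    dict_times_int_ok = False

# Multiplying a list that contains a dict is defined; the one dict is
# referenced three times rather than copied.
dicts = [{}] * 3
all_same_object = all(d is dicts[0] for d in dicts)
```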


py>  x[0]['key'] = 1
py>  x
[{'key': 1}, {'key': 1}, {'key': 1}, {'key': 1}, {'key': 1}]

And similarly for any other mutable object.

If you don't understand that lists can contain other mutable objects
apart from lists, then you really shouldn't be discussing this issue.

I do; that's why I DEMONSTRATED this issue in my own replies.


Your proposal throws away consistency for a trivial benefit on a rare
use- case, and replaces it with a bunch of special cases:
RARE!!!! You are NUTS!!!!
Yes, rare. I base that on about 15 years of Python coding and many
thousands (tens of thousands?) of hours on Python forums like this one.
What's your opinion based on?
Which opinion?

2D lists are NOT rare; I've seen them in dozens of python programs not written by me.

As to my other opinion regarding why change it, there are two separate issues:

One was that a poster asked if it would be difficult to do without introducing bugs; That's the question I answered affirmatively. The _OP_ problem can be removed using a simple fix which isn't going to break any major number of programs. That's a fact by your admission as well.

The second issue is would it be consistent to make the change (and NOTE I wasn't asked that, and wasn't answering that) You and others brought it up tangentially. I agree, There is a problem, as I noted, with subclassing. I also will note that ([],[],[]) is effectively the same question as sub-classing -- () are merely lists that are immutable.

As to how often people make the mistake -- There's a big difference between how often someone makes the mistake, and how often it shows up on a forum. The issue shows up in a forum when someone can't figure out what they did wrong after a long debugging session.

There is more than one way to resolve such an issue, even if a person doesn't know why their construction is wrong. If it is easy to see the construction produces the wrong result, one can simply try list comprehensions which will work correctly. Then, either the person learns why they made the mistake -- or they don't. If they are able to get it to work sometimes, but sometimes they can't -- they may or may not stop using it. I cite Java programming, where the API is notorious for that kind of inconsistent behavior; yet Java programmers still feel compelled to use those constructions.



values = [None]*n  # or 0 is another popular starting value

Using it twice to generate a 2D array is even rarer.
Sure, and if it is used that way -- I doubt it is ever used like [ {} ]*n, because that object will have a side effect. SO again, if this is the *main* use case, the default behavior is not the main reason they use it -- but they can often work around the default behavior by using it with specially thought about data.
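
A small sketch of why the [None]*n idiom is safe in practice: the repeated element is immutable, so the shared reference never shows (`n = 4` is an arbitrary choice for the example):

```python
n = 4
values = [None] * n     # four references to the single None object

# Assigning to a slot rebinds that slot only; None itself is never mutated,
# so the other slots are unaffected.
values[0] = 'filled'
```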


      Q: How about if I use delegation to proxy a list? A: Oh no, they
      definitely won't be copied.
Give an example usage of why someone would want to do this.  Then we can
discuss it.
Proxying objects is hardly a rare scenario. Delegation is less common
since you can subclass built-ins, but it is still used. It is a standard
design pattern.
I was not judging you; I was asking for an example to discuss.

Python is a twenty year old language. Do you really think this is the first time somebody has noticed it? It's hard to search for discussions on the dev list, because the obvious search terms bring up many false positives.
No, I don't think it's the first time.

But here are a couple of bug reports closed as "won't fix": http://bugs.python.org/issue1408 http://bugs.python.org/issue12597 I suspect it is long past time for a PEP so this can be rejected once and for all.
Yeah, that'd be good -- and perhaps they'd abolish it. :)
The copy speed will be the same or *faster*, and the typing less -- and
the psychological mistakes *less*, the elegance more.
You think that it is *faster* to copy a list than to make a new pointer
to it? Your credibility is not looking too good here.
YES -- WHEN copying by reference is a BUG, then copying is NOT by reference.
That's the use case I spelled out. You're changing the subject to make me look dumb? or being purposely facile to hide being destroyed on the substance of the argument ?

When true copying is desired, then doing it at the "C" level is better than the interpreter Level.
ergo: You're not looking too intelligent either.

It's hardly going to confuse anyone to say that lists are copied with
list multiplication, but the elements are not.
Well, that confuses me. What about a list where the elements are lists?
Are they copied?
YES! It's a LIST copy; all the lists, and only the lists, are copied; the rest are referenced.
What about other mutable objects? Are they copied?
No.
It's a list copy, not random mutable object copy.
What about mutable objects which are uncopyable, like file objects?
No.
It's a list copy, not a file copy.
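
A sketch of the semantics described in these answers, written as an ordinary function rather than a change to the `*` operator. The name `repeat_copying_lists` is hypothetical, and two details are assumptions on my part: nested lists are copied recursively, and list subclasses are treated as "not lists" (the subclassing problem is left open):

```python
def repeat_copying_lists(seq, n):
    """Hypothetical helper: like seq * n, but list elements are copied."""
    def copy_lists(obj):
        # Only plain lists are copied (recursively, by assumption); everything
        # else, including dicts and file objects, is kept by reference.
        if type(obj) is list:
            return [copy_lists(item) for item in obj]
        return obj

    result = []
    for _ in range(n):
        result.extend(copy_lists(item) for item in seq)
    return result

shared_dict = {}
rows = repeat_copying_lists([[0, shared_dict]], 3)
rows[0][0] = 99   # changes the first inner list only
```

Here the three inner lists are distinct objects, while `shared_dict` is still the same object in every row.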

Every time someone passes a list to a function, they *know* that the
list is passed by value -- and the elements are passed by reference.
And there goes the last of your credibility. *You* might "know" this, but
that doesn't make it so.
No, it's not gone.  You saying so, doesn't make it gone.

People aren't ignorant of the passing mechanism just because they didn't transfer from another language like "C", or "Pascal", "ADA", "Fortran", etc. But when they do transfer from a language which makes a distinction -- then, yes, it's weird.
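
For reference, a short sketch of what can be observed when a list is passed to a function in Python. The function names `append_item` and `rebind` are made up for the example:

```python
def append_item(lst):
    lst.append('added')      # mutates the caller's list object

def rebind(lst):
    lst = ['replaced']       # rebinds the local name only

data = ['original']
append_item(data)            # the caller sees the appended element
rebind(data)                 # the caller's list is not replaced
```

After both calls the caller's list has gained the appended element and has not been replaced, which is the behaviour both sides of this exchange are trying to name.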

Python's calling behaviour is identical to that used by languages including Java (excluding unboxed primitives) and Ruby, to mention only two. You're starting to shout and yell, so perhaps it's best if I finish this here.
Huh?
I'm not yelling any more than you are.  Are ???YOU??? yelling?

:-\

-- 
http://mail.python.org/mailman/listinfo/python-list
