Hi Alf,
Before I start, note we're talking about semantics, not implementation. That distinction is very important.

On Feb 11, 4:49 am, "Alf P. Steinbach" <al...@start.no> wrote:
> > *The* standard general language independent definition? [ of pointer ]
>
> Yes.
>
> > As defined where?
>
> For example, as I used as reference in my first posting, the Java language
> spec.
> But it has nothing specifically to do with Java. It is a basic concept in
> computer science, that (most) CS students learn in their *first year*.
>
> E.g.
>
> <quote src="http://cslibrary.stanford.edu/106/">
> A pointer stores a reference to something. Unfortunately there is no fixed
> term for the thing that the pointer points to, and across different computer
> languages there is a wide variety of things that pointers point to. We use
> the term pointee for the thing that the pointer points to, and we stick to
> the basic properties of the pointer/pointee relationship which are true in
> all languages.
> The term "reference" means pretty much the same thing as "pointer" --
> "reference" implies a more high-level discussion, while "pointer" implies the
> traditional compiled language implementation of pointers as addresses. For
> the basic pointer/pointee rules covered here, the terms are effectively
> equivalent.
> </quote>

This is where you have gone wrong. You have taken a first year undergraduate academic generalisation and assumed that it applies to The World. In theory, there is no difference between practice and theory, but in practice there is (so the saying goes).

The World however has another place for defining terms. That place is of highly varying quality, but it is generally a better place to find the accepted semantics of terms. Who knows, eventually there may be a single commonly accepted viewpoint. (Which would bring a whole new level of pedantry of course )-:

I am referring to Wikipedia here.
(This is a vague attempt at humour rather than an attempt to patronise, though it may also come over that way.)

Let's look at the tip of the iceberg for that:

"In computer science, a pointer is a programming language data type whose value refers directly to (or "points to") another value stored elsewhere in the computer memory using its address."
http://en.wikipedia.org/wiki/Pointer_%28computing%29

Similarly for Call by Value (which *is* a loaded term):

In call-by-value, the argument expression is evaluated, and the resulting value is bound to the corresponding variable in the function (frequently by copying the value into a new memory region). If the function or procedure is able to assign values to its parameters, only its local copy is assigned - that is, anything passed into a function call is unchanged in the caller's scope when the function returns.
http://en.wikipedia.org/wiki/Evaluation_strategy#Call_by_value

Call by Reference (again, a loaded term):

In call-by-reference evaluation (also referred to as pass-by-reference), a function receives an implicit reference to the argument, rather than a copy of its value. This typically means that the function can modify the argument - something that will be seen by its caller.
http://en.wikipedia.org/wiki/Evaluation_strategy#Call_by_reference

Call by Sharing (a far less common term):

Also known as "call by object" or "call by object-sharing" is an evaluation strategy first named by Barbara Liskov et al. for the language CLU in 1974[1]. It is used by languages such as Python[2], Iota, Java (for object references)[3], Ruby, Scheme, OCaml, AppleScript, and many other languages. The semantics of call-by-sharing differ from call-by-reference in that assignments to function arguments within the function aren't visible to the caller (unlike by-reference semantics).
http://en.wikipedia.org/wiki/Evaluation_strategy#Call_by_sharing

As you can see, there are generally accepted terms and definitions here, and Python is accepted as falling into neither the value camp nor the reference camp, along with some other languages.

Understanding why, IMO, comes back to some basic aspects of Python which I believe trip up experienced developers. This claim is based on talking to someone who has been coding for a couple of decades, who has done a CS degree (like me), and who just didn't understand why I would use Python. I spent a couple of _days_ explaining this, along with Python's evaluation model, and at the end we ended up where we started:

* Python is a language which is focussed on programmer performance, not machine performance, because relative to programmer cost, machines are cheap. Therefore you focus your language to optimise for the programmer, not the machine.

In this case, let's drop back to the word "pointer", which I can understand that you like. Indeed, it took me a fair while to let go of the word when talking about Python, but you do have to.

Why? Well, assume this definition isn't bad:

"In computer science, a pointer is a programming language data type whose value refers directly to (or "points to") another value stored elsewhere in the computer memory using its address."

OK, let's assume that we can generalise this to ignore the address bit, like so:

"In computer science, a pointer is a programming language data type whose value refers directly to (or "points to") another value stored elsewhere in the computer memory"

That's still not valid for Python - why? Let's keep trimming it back:

"A pointer is a programming language data type whose value refers directly to another value"

Seems OK. But not valid for Python. Why? Let's keep going.

"A pointer is a data type whose value"

Ah, maybe that is the reason.

Let's look at the canonical kind of way of describing what's happening here:

>>> x = [1,2,3]
>>> def hello(world):
...     print world
...
>>> hello(x)
[1, 2, 3]

This is actually a lot more complicated to explain than it might seem if someone starts thinking "what's going on underneath". Let's take the description you'd give to a beginner first. (OK, depends on the beginner.)

Beginner description
--------------------

A list object is created and initialised containing 3 integers: 1, 2 and 3. The name x is bound to this list object.

Next a function - hello - is defined which takes one argument. When called, the name world is bound to the object passed in. The function body then runs and the object bound to world is used by print to print whatever str(world) returns.

The function hello is then called with the value bound to the name x.

OK, that seems pretty simple and clear. A little odd in the terminology perhaps, but it's clear. We deal with names and objects.

The less beginner explanation
-----------------------------

>>> x = [1,2,3]

3 anonymous integer objects are created with the values 1, 2 and 3. These are bound inside a sequence which when iterated over yields the 3 integer objects. list is then called with this sequence object (of whatever kind it is under the hood). A list object is then created such that:

x[0] appears bound to the integer object 1
x[1] appears bound to the integer object 2
x[2] appears bound to the integer object 3

Note I do NOT say that x[0] contains a reference to object 1, because in languages that contain references you would:

a) have access to the reference
b) have a means of dereferencing the reference
c) have to dereference the reference in order to obtain the value.

This is because a reference is a form of l-value and r-value. It is a value in itself, and refers to a value. Python does not have such a beast. (The closest you can get is a class property, but I digress.)

In Python, the language, we do not have a) or b), nor do we have to do c). (Indeed we cannot do c).)

Why do I say "appears bound"? Because x[0] is syntactic sugar for a function call - x.__getitem__(0)

So...
This results in a list object being created allowing access to 3 integer objects. The name "x" is bound to this list object.

>>> def hello(world):
...     print world
...

This statement (not definition) creates a function object which accepts one argument, which will be an object labelled "world" inside the scope of the function. The resulting function object is bound to the name "hello" in the same scope as "x". The resulting function object also has a bunch of other attributes, which I won't go into.

>>> hello(x)
[1, 2, 3]

This is quite complex if you go down to this layer. hello is a name bound to the function object we created earlier. Python effectively does this:

>>> getattr(hello, "__call__")

And uses that as the method to call with the arguments provided. Specifically, the object that x is bound to is passed as the sole argument to the __call__ method (wrapper) that getattr(hello, "__call__") returned.

The object, not a reference. The object, not a pointer.

This is because Python's semantics are designed for humans, not machines.

When applying the object we humans call x to the function object we humans call hello ...

>>> def hello(world):
...     print world

...inside the scope of the function object, the name world is bound to the object that the calling scope calls x.

This may be *implemented* in one runtime using references in one language. It may be *implemented* in another runtime using pointers in another. In PyPy the implementation will be to pass the object by sharing. (I suspect; I've not looked at PyPy's internals, but that's what I'd expect.)

Anyhow, by hook or by crook, inside hello, the name world is now bound to our previously created list object, which itself has made it such that we can retrieve the 3 integers we asked it to.

The print statement evaluates str(world), which evaluates world.__str__(), and passes the resulting string object to the subsystem that actually spits characters out of stdout.
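The binding described above can be checked directly. What follows is only a minimal sketch, written in Python 3 syntax (so print is a function rather than the statement used in the sessions above). The names same_object, seen_id and same_again are mine, introduced purely for illustration.

```python
x = [1, 2, 3]

def hello(world):
    # Inside the call, the name "world" is bound to the very object
    # that the calling scope knows as "x". No copy is made.
    print(world)
    return world is x, id(world)

# Ordinary call syntax.
same_object, seen_id = hello(x)

# Roughly what the call syntax does: look up __call__ and invoke it.
same_again, _ = getattr(hello, "__call__")(x)

print(same_object, same_again, seen_id == id(x))
```

Both calls print [1, 2, 3], and the final line prints True three times: the name world inside the function and the name x outside it are two labels on one object.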
That's a significantly more complicated explanation, but it still does not need or use the terms pointers or references.

The object itself is passed as the argument. The fact that an object may have many names is by the by.

Now, you may be asking yourself (indeed I'm certain you may be) "But how is that any different, really, from pointers and references?"

Indeed it will seem odd to hear clearly well educated people arguing against you saying "no, they're not pointers or references".

It seems odd primarily because in languages where you have pointers and references, pointers and references are in themselves values. Specifically, in such languages you start talking about l-values and r-values. An r-value is a value that may be assigned to an l-value. An l-value essentially thus denotes storage - a destination for an r-value.

An l-value is a machine oriented concept. A name is a human oriented concept. Python doesn't have l-values. It has names, and r-values. A name is just a name, and has no value. When the name is used syntactically, the value (object) it is bound to is used in its place.

Thus your example "demonstrating" that references exist in Python is something you are not yourself understanding correctly. Rather than thinking of names as boxes which can contain values, think of them as labels (post-it notes) that can be stuck onto values/objects. If you do this, it will simplify your thinking.

This was your wording:

    s = ["A"]
    t = s          # Copies the reference.
    t[0] = "B"     # Changes the referenced object via the reference copy.
    print( s )     # Inspects object (now changed) via original reference.

This is mine:

    s = ["A"]

A list object is created such that the string object "A" can be retrieved from it. The label "s" is stuck on the object.

    t = s

Sticks the label "t" onto the object as well.

    t[0] = "B"

Applies the __setitem__ method of the list object we've created with the integer object 0 and string object "B".
(What this does is anyone's guess really; someone could have changed the implementation of list before these 4 lines were executed.)

    print( s )

print takes the list object which s is stuck onto. It then calls the __str__ method of the list object passed to it, gets back a string object and spits that out of stdout.

Whether we choose to use the label s or t is irrelevant; the object _itself_ is passed in (in terms of semantics).

Again, note, we're talking about semantics, not implementation. That distinction is very important.

Once again, you might say, "but my explanation works well with references and pointers", and yes, in terms of implementation, those things will exist. But if you confuse the implementation with the semantics, you run into problems.

For example, going back to your (incorrect) explanation:

    s = ["A"]
    t = s          # Copies the reference.
    t[0] = "B"     # Changes the referenced object via the reference copy.
    print( s )     # Inspects object (now changed) via original reference.

If t or s are references or pointers, I should be able to deal with the reference or pointer itself as a value. After all, accepted definitions of pointer boil down to:

"A pointer is a programming language data type whose value refers directly to another value"

And similarly for references:

"Generally, a reference is a value that enables a program to directly access the particular data item."
http://en.wikipedia.org/wiki/Reference#Computer_science

i.e. both pointers and references are themselves values that enable access to the value AND are NOT in themselves the value they point/refer to.

In Python, if I do this:

    def hello(world):
        print world

    X = [1,2,3]
    hello(X)

Then "X" is best described as a name bound to the object [1,2,3].

hello(X) is best described as calling the function bound to the name "hello" with a single argument being the object that the name "X" is bound to.

Inside hello, "world" is a name that is bound to whatever object gets slung inside as its first argument. Not a pointer.
Not a reference. The object.

These semantics are radically different from those of languages like C, Pascal, and similar.

One final approach.

Suppose I do this:

    a = (1,2,3)
    b = (4,5,6)
    c = (7,8,9)
    d = [a, b, c]

And I claim that a, b and c are references (as per your explanation).

This would mean that d[2] must also be the same reference as c (if you take d[2] to be a reference - which it isn't).

Therefore updating d[2] must update c.

That could imply that

    d[2] += (10,)

should result in c's reference referring to the value (7,8,9,10), and in d having the value [ (1,2,3), (4,5,6), (7,8,9,10) ].

This is perfectly logical semantics if you're dealing with references and pointers.

However, Python doesn't deal with them. It deals with objects and names. d does indeed end up with that value, but c is left unchanged. The reason is that Python only has r-values and names, and no l-values.

    d[2] += (10,)

Results in d[2] being evaluated - that is, the __getitem__ method on the object d is bound to is called with the argument 2.

This returns the tuple object (7,8,9).

The __add__ method of that tuple object is then called, with the argument (10,).

This returns a new tuple object (7,8,9,10).

Finally, the __setitem__ method of the object d is bound to is called with the argument 2 and the freshly minted tuple object (7,8,9,10).

This results in c being left unchanged. This results in d having the correct value.

If we were dealing with references and pointers it would be perfectly acceptable for c to be modified. In Python, the label "c" cannot be tacked onto the returned value.

Crucially, a pointer can be a pointer to a pointer, and you can have references to references (which allows you to dereference the pointer/reference and change the pointer/reference).

You can't do this with Python names directly. (You can come close using properties, and by messing around in globals and locals, but that's getting quite nasty, and not really the same semantics.)
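The d[2] += (10,) walk-through above is easy to try out. The following is a small sketch in Python 3 syntax; the names e, old and new are not from the discussion and exist only to spell out the three method calls by hand, as an approximation of what the augmented assignment does for a tuple.

```python
a = (1, 2, 3)
b = (4, 5, 6)
c = (7, 8, 9)
d = [a, b, c]

# Before the update, d[2] and c name the same tuple object.
assert d[2] is c

# The augmented assignment from the text.
d[2] += (10,)

# The same steps written out by hand on a second list (e, old and new
# are hypothetical names used only for this illustration).
e = [a, b, c]
old = e.__getitem__(2)       # the tuple object (7, 8, 9)
new = old.__add__((10,))     # a brand new tuple object (7, 8, 9, 10)
e.__setitem__(2, new)        # slot 2 of the list now names the new tuple

print(d)   # [(1, 2, 3), (4, 5, 6), (7, 8, 9, 10)]
print(c)   # (7, 8, 9)
```

The list ends up holding the new tuple while the label c stays stuck on the original object, and the hand-written version in e gives the same result as the augmented assignment.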
Anyway, I hope you've not found that patronising or ad-hominem. If you've found my explanation overly simplistic or its phrasing too simple, please excuse that - it's a style of writing I picked up on a writing course once.

In summary, if you claim that there are references or pointers here...

    s = ["A"]
    t = s          # Copies the reference.
    t[0] = "B"     # Changes the referenced object via the reference copy.
    print( s )     # Inspects object (now changed) via original reference.

... since references and pointers are values themselves, not the values they refer to, how can I store a reference to a reference? That is, how can I update the value of s from a function?

Specifically, if Python has pointers or references, this should be doable:

    s = 1          # Claim is that s is a reference to 1
    update(s, 2)   # Update the reference s to point at the value 2

How do I write the function "update" used above in pure Python (without messing with globals(), locals(), or diving into any other language implementation specific concepts)?

After all, I can write that in any other language that has pointers & references.

Finally, note, we're talking about semantics, not implementation. That distinction is very important.

If this seems odd, the reason is that Python's semantics are very much more like those of a functional language when it comes to function calls than they are like those of standard imperative languages. Python's functions are all impure however, and side effects are common, due to objects not (all) being immutable.

Your experience here is leading you to the wrong conclusions and incorrect reasoning.

Regards,

Michael
--
http://www.kamaelia.org/