Re: What is a type error?
Darren New wrote: Chris Smith wrote: Specialized language already exist that reliably (as in, all the time) move array bounds checking to compile time; It sounds like this means the programmer has to code up what it means to index off an array, yes? Otherwise, I can't imagine how it would work. x := read_integer_from_stdin(); write_to_stdout(myarray[x]); What does the programmer have to do to implement this semantic in the sort of language you're talking about? Surely something somewhere along the line has to fail (for some meaning of failure) at run-time, yes? You should really take a look at Epigram. One of its main features is that it makes it possible not only to statically /check/ invariants, but also to /establish/ them. In your example, of course the program has to check the integer at runtime. However, in a dependent type system, the type of the value returned from the check-function can very well indicate whether it passed the test (i.e. being a valid index for myarray) or not. Ben -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Joachim Durchholz wrote: Marshall schrieb: Chris Smith wrote: Joachim Durchholz [EMAIL PROTECTED] wrote: I *think* I understand Marshall here. When you are saying assignment, you mean assignment to values of attributes within tuples of the cell. When Marshall is saying assignment, he seems to mean assigning a completely new *table* value to a relation; i.e., wiping out the entire contents of the relation and replacing it with a whole new set of tuples. Your assignment is indeed less powerful than DML, whereas Marshall's assignment is more powerful than DML. Exactly. Ah well. I never meant that kind of assignment. Sorry for the confusion. Marshall -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Marshall schrieb: Chris Smith wrote: Joachim Durchholz [EMAIL PROTECTED] wrote: I *think* I understand Marshall here. When you are saying assignment, you mean assignment to values of attributes within tuples of the cell. When Marshall is saying assignment, he seems to mean assigning a completely new *table* value to a relation; i.e., wiping out the entire contents of the relation and replacing it with a whole new set of tuples. Your assignment is indeed less powerful than DML, whereas Marshall's assignment is more powerful than DML. Exactly. Ah well. I never meant that kind of assignment. Besides, all the aliasing possibilities already happen at the field level, so it never occurred to me that whole-table assignment might enter into the picture. Regards, Jo -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Marshall schrieb: No. The variable is the table, not the records. In your examples, yes. Relations are not arrays. They are, in all ways that matter for aliasing: They are a collection of mutable data, accessible via selector values. Records are not lvalues. Agreed, but they have identity. An lvalue is an identity that you can track in software; record identity is untrackable in standard SQL, but it's real enough to create aliasing problems. I would propose that variables have identity, and values do not. What's a variable, by that definition? It can't be something that has a name in a program, since that would exclude variables on the heap. Another definition would be something that may have a different value if I look at it sometime later, but then that would include SQL records as well. Besides, the temporal aspect is problematic, since it's unclear whether the it some time later is still the same as the it before (if you have a copying garbage collector, you'd have to somehow establish identity over time, and we're full circle back at defining identity). Personally, I say two things are identical / have the same identity, if they are (a) equal and (b) stay equal after changing just one of them. That's a nice local definition, and captures exactly those cases where aliasing is relevant in the first place (it's actually that changing here also changes there aspect that makes aliasing so dangerous). It also covers SQL records. Well, they do have aliasing problems, or problems that are similar enough to them, so I'd say this is a surprising but actually interesting and useful result of having that definition. In part this is via the supplied definition of identity, in which, when you change one thing, if something else changes as well, they share identity. Since one cannot change values, they necessarily lack identity. I agree that immutable values don't have identity (or, rather, that identity isn't an interesting concept for them since it's already subsumed under equality). It's certainly not storage location: if you DELETE a record and then INSERT it with the same values, it may be allocated somewhere entirely else, and our intuition would say it's not the same (i.e. not identical). Well, it would depend on how our intuition had been primed. If it was via implementation techniques in low level languages, we might reach a different conclusion than if our intuition was primed via logical models and relation theory. Sure. However, since I'm interested in practical consequences for programming, I'm looking for definitions that allow me to capture those effects that create bugs. In the case of identity, I'd say that's aliasing. Since intuition gives me ambivalent results here, I'll go back to my semiformal definition (and take the opportunity to make it a bit more precise): Two path expressions (lvalues, ...) are aliases if and only if the referred-to values compare equal, and if they stay equal after applying any operation to the referred-to value through either of the path expressions. Alas, this leaves me confused. I don't see how a path expression (in this case, SELECT ... WHERE) can be an l-value. You cannot apply imperative operations to the result. It's not SELECT...WHERE..., it's WHERE... that's an lvalue. And you can indeed use it with UPDATE, so yes it's an lvalue by my book. (A given WHERE clause may become invalid or refer to a different record after some updates, so it's not the kind of lvalue that you have in a C program.) (Also I think the use of equality here is too narrow; it is only necessary to show that two things both change, not that they change in the same way.) Sorry for using misleading terminology. An alias is when two names refer to an identical object (both in real life and IMHO in programming), and that's what I wrote. Aliasing is the kind of problems that you describe. I was under the impression you agred that i+2 was not a path expression. Yes, but [i+2] is one. In the same vein, name = 'John' is an expression, WHERE name = 'John' is a path expression. If our hypothetical language lacks record identity, then I would say that any query is simply an expression that returns a value, as in i+2. As long as the data is immutable, yes. The question is how equality behaves under mutation. On the other hand, variable addresses as tangible identities don't hold much water anyway. Imagine data that's written out to disk at program end, and read back in. Further imagine that while the data is read into main memory, there's a mechanism that redirects all further reads and writes to the file into the read-in copy in memory, i.e. whenever any program changes the data, all other programs see the change, too. Alternatively, think about software agents that move from machine to machine, carrying their data with them. They might be communicating with each other, so they need some means of
Re: What is a type error? [correction]
David Hopwood wrote: Darren New wrote: David Hopwood wrote: public class LoopInitTest { public static String getString() { return foo; } public static void main(String[] args) { String line = getString(); boolean is_last = false; while (!is_last) { if (line.charAt(0) == 'q') { is_last = true; } // insert line into inputs (not important for analysis) if (!is_last) { line = getString(); } } } } which compiles without error, because is_last is definitely initialized. At what point do you think is_last or line would seem to not be initialized? They're both set at the start of the function, and (given that it's Java) nothing can unset them. At the start of the while loop, it's initialized. At the end of the while loop, it's initialized. So the merge point of the while loop has it marked as initialized. Apparently, Hermes (at least the version of it described in that paper) essentially forgets that is_last has been initialized at the top of the loop, and so when it does the merge, it is merging 'not necessarily initialized' with 'initialized'. No, it's not that it forgets. It's that the insert line into inputs unitializes line. Hence, line is only conditionally set at the bottom of the loop, so it's only conditionally set at the top of the loop. This sounds like a pretty easy thing to fix to me (and maybe it was fixed later, since there are other papers on Hermes' typestate checking that I haven't read yet). You simply misunderstand the insert line into inputs semantics. Had that line actually been commented out in the Hermes code, the loop would have compiled without a problem. That said, in my experience, finding this sort of typestate error invariably led me to writing clearer code. boolean found_ending = false; while (!found_ending) { string line = getString(); if (line.charAt(0) == 'q') found_ending = true; else insert line into inputs; } It seems that's much clearer to me than the contorted version presented as the example. If you want it to work like the Java code, where you can still access the line variable after the loop, the rearrangement is trivial and transparent as well. -- Darren New / San Diego, CA, USA (PST) This octopus isn't tasty. Too many tentacles, not enough chops. -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error? [correction]
Darren New wrote: David Hopwood wrote: [...] Apparently, Hermes (at least the version of it described in that paper) essentially forgets that is_last has been initialized at the top of the loop, and so when it does the merge, it is merging 'not necessarily initialized' with 'initialized'. No, it's not that it forgets. It's that the insert line into inputs unitializes line. Hence, line is only conditionally set at the bottom of the loop, so it's only conditionally set at the top of the loop. This sounds like a pretty easy thing to fix to me (and maybe it was fixed later, since there are other papers on Hermes' typestate checking that I haven't read yet). You simply misunderstand the insert line into inputs semantics. Yes, you're right, I did misunderstand this. Had that line actually been commented out in the Hermes code, the loop would have compiled without a problem. That said, in my experience, finding this sort of typestate error invariably led me to writing clearer code. boolean found_ending = false; while (!found_ending) { string line = getString(); if (line.charAt(0) == 'q') found_ending = true; else insert line into inputs; } It seems that's much clearer to me than the contorted version presented as the example. If you want it to work like the Java code, where you can still access the line variable after the loop, the rearrangement is trivial and transparent as well. I agree. -- David Hopwood [EMAIL PROTECTED] -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Marshall schrieb: Joachim Durchholz wrote: Marshall schrieb: Good point. Perhaps I should have said relational algebra + variables with assignment. It is interesting to consider assignment vs. the more restricted update operators: insert, update, delete. Actually I see it the other way round: assignment is strictly less powerful than DML since it doesn't allow creating or destroying variables, while UPDATE does cover assignment to fields. Oh, my. Well, for all table variables T, there exists some pair of values v and v', such that we can transition the value of T from v to v' via assignment, but not by any single insert, update or delete. I fail to see an example that would support such a claim. On the other hand, UPDATE can assign any value to any field of any record, so it's doing exactly what an assignment does. INSERT/DELETE can create resp. destroy records, which is what new and delete operators would do. I must really be missing the point. Further, it is my understanding that your claim of row identity *depends* on the restricted nature of DML; if the only mutator operation is assignment, then there is definitely no record identity. Again, I fail to connect. I and others have given aliasing examples that use just SELECT and UPDATE. (However, it's usually new+assignment+delete vs. INSERT+UPDATE+DELETE, at which point there is not much of a difference.) I am not sure what this means. Delete can be expressed in terms of assignment. (As can insert and update.) INSERT cannot be expressed in terms of assignment. INSERT creates a new record; there's no way that assignment in a language like C can create a new data structure! The same goes for DELETE. (Assignment can also be expressed in terms of insert and delete.) Agreed. I also realize that this makes it a bit more difficult to nail down the nature of identity in a database. It's certainly not storage location: if you DELETE a record and then INSERT it with the same values, it may be allocated somewhere entirely else, and our intuition would say it's not the same (i.e. not identical). (In a system with OID, it would even be impossible to recreate such a record, since it would have a different OID. I'm not sure whether this makes OID systems better or worse at preserving identity, but that's just a side track.) Since intuition gives me ambivalent results here, I'll go back to my semiformal definition (and take the opportunity to make it a bit more precise): Two path expressions (lvalues, ...) are aliases if and only if the referred-to values compare equal, and if they stay equal after applying any operation to the referred-to value through either of the path expressions. In the context of SQL, this means that identity isn't the location where the data is stored. It's also not the values stored in the record - these may change, including key data. SQL record identity is local, it can be defined from one operation to the next, but there is no such thing as a global identity that one can memorize and look up years later, without looking at the intermediate states of the store. It's a gross concept, now that I think about it. Well, or at least rather alien for us programmers, who are used to taking the address of a variable to get a tangible identity that will stay stable over time. On the other hand, variable addresses as tangible identities don't hold much water anyway. Imagine data that's written out to disk at program end, and read back in. Further imagine that while the data is read into main memory, there's a mechanism that redirects all further reads and writes to the file into the read-in copy in memory, i.e. whenever any program changes the data, all other programs see the change, too. Alternatively, think about software agents that move from machine to machine, carrying their data with them. They might be communicating with each other, so they need some means of establishing identity (addressing) the memory buffers that they use for communication. I don't know what new would be in a value-semantics, relational world. It would be INSERT. Um, my idea of value semantics is associated with immutable values. SQL with INSERT/DELETE/UPDATE certainly doesn't match that definition. So by my definition, SQL doesn't have value semantics, by your definition, it would have value semantics but updates which are enough to create aliasing problems, so I'm not sure what point you're making here... Filters are just like array indexing: both select a subset of variables from a collection. I can't agree with this wording. A filter produces a collection value from a collection value. I don't see how variables enter in to it. A collection can consist of values or variables. And yes, I do think that WHERE is a selection over a bunch of variables - you can update records after all, so they are variables! They don't have a name, at least none which is guaranteed to be
Re: What is a type error?
Joachim Durchholz [EMAIL PROTECTED] wrote: +--- | INSERT cannot be expressed in terms of assignment. INSERT creates a new | record; there's no way that assignment in a language like C can create a | new data structure! The same goes for DELETE. +--- Well, what about malloc() free()? I mean, if your database is a simple linked list, then INSERT is just: ROW *p = malloc(sizeof *p); p-field1 = value1; p-field2 = value2; ... p-fieldN = valueN; database = cons(p, database); /* Common Lisp's CONS */ and DELETE is just: ROW *p = find_if(predicate_function, database); /* CL's FIND-IF */ database = delete(p, database); /* CL's DELETE */ free(p); [There are single-pass methods, of course, but...] -Rob - Rob Warnock [EMAIL PROTECTED] 627 26th Avenue URL:http://rpw3.org/ San Mateo, CA 94403 (650)572-2607 -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Chris Smith schrieb: David Hopwood [EMAIL PROTECTED] wrote: Chris Smith wrote: If checked by execution, yes. In which case, I am trying to get my head around how it's any more true to say that functional languages are compilable postconditions than to say the same of imperative languages. A postcondition must, by definition [*], be a (pure) assertion about the values that relevant variables will take after the execution of a subprogram. Okay, my objection was misplaced, then, as I misunderstood the original statement. I had understood it to mean that programs written in pure functional languages carry no additional information except for their postconditions. Oh, but that's indeed correct for pure functional languages. (Well, they *might* carry additional information anyway, but it's not a requirement to make it a programming language.) The answer that I have for your original objection may be partial, but here goes anyway: Postconditions cannot easily capture all side effects. E.g. assume there's a file-open routine but no way to test whether a given file handle was ever opened (callers are supposed to test the return code from the open() call). In an imperative language, that's a perfectly valid (though possibly not very good) library design. Now the postcondition would have to say something like from now on, read() and write() are valid on the filehandle that I returned. I know of no assertion language that can express such temporal relationships, and even if there is (I'm pretty sure there is), I'm rather sceptical that programmers would be able to write correct assertions, or correctly interpret them - temporal logic offers several traps for the unwary. (The informal postcondition above certainly isn't complete, since it doesn't cover close(); I shunned the effort to get a wording that would correctly cover sequences like open()-close()-open(), or open()-close()-close()-open().) Regards, Jo -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Rob Warnock schrieb: Joachim Durchholz [EMAIL PROTECTED] wrote: +--- | INSERT cannot be expressed in terms of assignment. INSERT creates a new | record; there's no way that assignment in a language like C can create a | new data structure! The same goes for DELETE. +--- Well, what about malloc() free()? These do exactly what assignment cannot do. The correspondence between SQL and C would be: INSERT - malloc() (create a new data structure) DELETE - free() (destroy a data structure) UPDATE - assignment (change data within a data structure) Regards, Jo -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Joachim Durchholz [EMAIL PROTECTED] wrote: I fail to see an example that would support such a claim. On the other hand, UPDATE can assign any value to any field of any record, so it's doing exactly what an assignment does. INSERT/DELETE can create resp. destroy records, which is what new and delete operators would do. I must really be missing the point. I *think* I understand Marshall here. When you are saying assignment, you mean assignment to values of attributes within tuples of the cell. When Marshall is saying assignment, he seems to mean assigning a completely new *table* value to a relation; i.e., wiping out the entire contents of the relation and replacing it with a whole new set of tuples. Your assignment is indeed less powerful than DML, whereas Marshall's assignment is more powerful than DML. -- Chris Smith - Lead Software Developer / Technical Trainer MindIQ Corporation -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Chris Smith wrote: Joachim Durchholz [EMAIL PROTECTED] wrote: I fail to see an example that would support such a claim. On the other hand, UPDATE can assign any value to any field of any record, so it's doing exactly what an assignment does. INSERT/DELETE can create resp. destroy records, which is what new and delete operators would do. I must really be missing the point. I *think* I understand Marshall here. When you are saying assignment, you mean assignment to values of attributes within tuples of the cell. When Marshall is saying assignment, he seems to mean assigning a completely new *table* value to a relation; i.e., wiping out the entire contents of the relation and replacing it with a whole new set of tuples. Your assignment is indeed less powerful than DML, whereas Marshall's assignment is more powerful than DML. Exactly. Marshall -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Joachim Durchholz wrote: Marshall schrieb: Joachim Durchholz wrote: Marshall schrieb: Good point. Perhaps I should have said relational algebra + variables with assignment. It is interesting to consider assignment vs. the more restricted update operators: insert, update, delete. Actually I see it the other way round: assignment is strictly less powerful than DML since it doesn't allow creating or destroying variables, while UPDATE does cover assignment to fields. Oh, my. Well, for all table variables T, there exists some pair of values v and v', such that we can transition the value of T from v to v' via assignment, but not by any single insert, update or delete. I fail to see an example that would support such a claim. variable T : unary relation of int T = { 1, 2, 3 }; // initialization T := { 4, 5 }; // assignment The above transition of the value of T cannot be be done by any one single insert, update or delete. Two would suffice, however. (In fact, any assignement can be modeled at a full delete followed by an insert of the new value.) On the other hand, UPDATE can assign any value to any field of any record, Yes. so it's doing exactly what an assignment does. No. The variable is the table, not the records. Relations are not arrays. Records are not lvalues. INSERT/DELETE can create resp. destroy records, which is what new and delete operators would do. I must really be missing the point. Further, it is my understanding that your claim of row identity *depends* on the restricted nature of DML; if the only mutator operation is assignment, then there is definitely no record identity. Again, I fail to connect. I and others have given aliasing examples that use just SELECT and UPDATE. Sure, but update's semantics are defined in a per-record way, which is consistent with record identity. Assignment's isn't. (However, it's usually new+assignment+delete vs. INSERT+UPDATE+DELETE, at which point there is not much of a difference.) I am not sure what this means. Delete can be expressed in terms of assignment. (As can insert and update.) INSERT cannot be expressed in terms of assignment. INSERT creates a new record; there's no way that assignment in a language like C can create a new data structure! The same goes for DELETE. I was intendind to be discussing a hypothetical relation-based language, so while I generally agree with you statement about C, I don't see how it applies. (Assignment can also be expressed in terms of insert and delete.) Agreed. I also realize that this makes it a bit more difficult to nail down the nature of identity in a database. I would propose that variables have identity, and values do not. In part this is via the supplied definition of identity, in which, when you change one thing, if something else changes as well, they share identity. Since one cannot change values, they necessarily lack identity. It's certainly not storage location: if you DELETE a record and then INSERT it with the same values, it may be allocated somewhere entirely else, and our intuition would say it's not the same (i.e. not identical). Well, it would depend on how our intuition had been primed. If it was via implementation techniques in low level languages, we might reach a different conclusion than if our intuition was primed via logical models and relation theory. (In a system with OID, it would even be impossible to recreate such a record, since it would have a different OID. I'm not sure whether this makes OID systems better or worse at preserving identity, but that's just a side track.) OIDs are something of a kludge, and they break set semantics. Since intuition gives me ambivalent results here, I'll go back to my semiformal definition (and take the opportunity to make it a bit more precise): Two path expressions (lvalues, ...) are aliases if and only if the referred-to values compare equal, and if they stay equal after applying any operation to the referred-to value through either of the path expressions. Alas, this leaves me confused. I don't see how a path expression (in this case, SELECT ... WHERE) can be an l-value. You cannot apply imperative operations to the result. (Also I think the use of equality here is too narrow; it is only necessary to show that two things both change, not that they change in the same way.) I was under the impression you agred that i+2 was not a path expression. If our hypothetical language lacks record identity, then I would say that any query is simply an expression that returns a value, as in i+2. In the context of SQL, this means that identity isn't the location where the data is stored. It's also not the values stored in the record - these may change, including key data. SQL record identity is local, it can be defined from one operation to the next, but there is no such thing as a global identity that one can memorize and look up years
Re: What is a type error?
Chris Smith wrote: Darren New [EMAIL PROTECTED] wrote: I'm not sure what linear or uniqueness typing is. It's typestate, and if I remember correctly the papers I read 10 years ago, the folks at TJWatson that invented Hermes also invented the concept of typestate. They at least claim to have coined the term. Coining the term is one thing, but I feel pretty confident that the idea was not invented in 1986 with the Hermes language, but rather far earlier. Yes. However, the guys who invented Hermes didn't come up with it out of the blue. It was around (in NIL - Network Implementation Language) for years beforehand. I read papers about these things in graduate school, but I don't know where my photocopies are. NIL was apparently quite successful, but a niche language, designed by IBM for programming IBM routers. Hermes was an attempt years later to take the same successful formula and turn it into a general-purpose programming system, which failed (I believe) for the same reason that a general purpose operating system that can't run C programs will fail. Perhaps they may have invented the concept of considering it any different from other applications of types, though. From what I can determine, the authors seem to imply that typestate is dataflow analysis modified in (at least) two ways: 1) When control flow joins, the new typestate is the intersection of typestates coming into the join, where as dataflow analysis doesn't guarantee that. (They imply they think dataflow analysis is allowed to say the variable might or might not be initialized here, while typestate would ensure the variable is uninitialized.) 2) The user has control over the typestate, so the user can (for exmaple) assert a variable is uninitialized at some point, and by doing so, make it so. How this differs from theoretical lambda types and all I couldn't say. What is being named here is the overcoming of a limitation that programming language designers imposed upon themselves, whether from not understanding the theoretical research or not believing it important, I don't know. I believe there's also a certain level of common-senseness needed to make a language even marginally popular. :-) While it's possible that there's really no difference between type and typestate at the theoretical level, I think most practical programmers would have trouble wrapping their head around that, just as programming in an entirely recursive pattern when one is used to looping can be disorienting. -- Darren New / San Diego, CA, USA (PST) This octopus isn't tasty. Too many tentacles, not enough chops. -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Joachim Durchholz wrote: of no assertion language that can express such temporal relationships, and even if there is (I'm pretty sure there is), I'm rather sceptical that programmers would be able to write correct assertions, or correctly interpret them - temporal logic offers several traps for the unwary. FWIW, this is exactly the area to which LOTOS (Language Of Temporal Orderering Specifications) is targetted at. It's essentially based on CSP, but somewhat extended. It's pretty straightforward to learn and understand, too. Some have even added realtime constraints to it. -- Darren New / San Diego, CA, USA (PST) This octopus isn't tasty. Too many tentacles, not enough chops. -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Marshall [EMAIL PROTECTED] wrote: We seem to have slipped back from the hypothetical relation language with only assignement back to SQL. I missed the point where we started discussing such a language. I suspect it was while some of us were still operating under the misconception that you assignment to attributes of tuples, rather than to entire relations. I don't see how such a language (limited to assignment of entire relations) is particularly helpful to consider. If the relations are to be considered opaque, then there's clearly no aliasing going on. However, such a language doesn't seem to solve any actual problems. It appears to be nothing other than a toy language, with a fixed finite set of variables having only value semantics, no scope, etc. I assume that relational databases will have the equivalent of SQL's update statement; and if that's not the case, then I would need someone to explain how to accomplish the same goals in the new relational language; i.e. it would need some way of expressing transformations of relations, not just complete replacement of them with new relations that are assumed to appear out of thin air. -- Chris Smith - Lead Software Developer / Technical Trainer MindIQ Corporation -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Marshall wrote: I would propose that variables have identity, and values do not. In part this is via the supplied definition of identity, in which, when you change one thing, if something else changes as well, they share identity. Maybe you gave a better definition the first time, but this one doesn't really hold up. of equality here is too narrow; it is only necessary to show that two things both change, not that they change in the same way.) If I change X, then Y[X] changes also. I don't think X is identical to Y or Y[X], nor is it an alias of either. I think that's where the equality comes into it. -- Darren New / San Diego, CA, USA (PST) This octopus isn't tasty. Too many tentacles, not enough chops. -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Marshall wrote: I am having a hard time with this very broad definition of aliasing. How about this definition: Consider three variables, i, j, and k, and a functional equivalence predicate (EQUIVALENT(i, j) returns true if for every pure function F, F(i) = F(j)). Now suppose i and j are EQUIVALENT at some point, then a side effecting function G is invoked on k, after which i and j are no longer equivalent. Then there is aliasing. This is still a little awkward, but there are three main points: 1. Aliasing occurs between variables (named objects). 2. It is tied to the notion of equivalence. 3. You can detect it when a procedure that has no access to a value can nonetheless modify the value. In a call-by-value language, you cannot alias values directly, but if the values are aggregate data structures (like in Java), you may be able to modify a shared subcomponent. -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Chris Smith wrote: Marshall [EMAIL PROTECTED] wrote: We seem to have slipped back from the hypothetical relation language with only assignement back to SQL. [...] I don't see how such a language (limited to assignment of entire relations) is particularly helpful to consider. I find it useful to consider various language features together and individually, to see how these features interact. If nothing else, it furthers my understanding of how languages work. If the relations are to be considered opaque, then there's clearly no aliasing going on. Not certain I understand, but I think I agree. However, such a language doesn't seem to solve any actual problems. It appears to be nothing other than a toy language, with a fixed finite set of variables having only value semantics, no scope, etc. No, such a language is entirely useful, since relational assignment is *more* expressive than insert/update/delete. (How did you get no scope by the way? And fixed variables? All I said was to change ins/upd/del to assignment.) Also note one could fairly easily write a macro for ins/upd/del if one had assignement. (See below.) Now, there are some significant issues with trying to produce a performant implementation in a distributed environment with strictness. In fact I will go so far as to say it's impossible. However on a single machine I don't see any reason to think it would perform any worse, (although I haven't thought about it deeply.) I assume that relational databases will have the equivalent of SQL's update statement; and if that's not the case, then I would need someone to explain how to accomplish the same goals in the new relational language; i.e. it would need some way of expressing transformations of relations, not just complete replacement of them with new relations that are assumed to appear out of thin air. Consider: i := i + 1; Note that the new value of i didn't just appear out of thin air; it was in fact based on the previous value of i. Given: variable T : relation of r; update_fn : r - r // takes one record and returns new, updated record filter_fn : r - boolean // indicator function insert(T, new_rows) = { T := T union new_rows; } update(T, update_fn, filter_fn) = { transaction { let Tmp = filter(T, filter_fn); // select only those rows to update T := T minus Tmp; T := T union update_fn(Tmp); } } delete(T,filter_fn) = { T := T minus filter(T, filter_fn); } So we can define insert, update and delete in terms of relational assignment, relational subtraction, and relational union. Type checking details omitted. Marshall -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Darren New wrote: From what I can determine, the authors seem to imply that typestate is dataflow analysis modified in (at least) two ways: 1) When control flow joins, the new typestate is the intersection of typestates coming into the join, where as dataflow analysis doesn't guarantee that. (They imply they think dataflow analysis is allowed to say the variable might or might not be initialized here, while typestate would ensure the variable is uninitialized.) Right, but this is a disadvantage of their typestate algorithm. It is why the example in Figure 2 of http://www.cs.ubc.ca/local/reading/proceedings/spe91-95/spe/vol25/issue4/spe950wk.pdf fails to check, even though it obviously initializes all variables. Consider the equivalent Java program: public class LoopInitTest { public static void main(String[] args) { boolean b; int i; while (true) { b = true; if (b) { v = 1; } v = v + 1; } } } As it happens, javac rejects this: LoopInitTest.java:12: variable v might not have been initialized v = v + 1; ^ but for a different and more trivial reason than the Hermes algorithm. Change if (b) { v = 1; } to just v = 1;, and the Java version will be accepted by its definite assignment analysis (which is a dataflow analysis), but the equivalent Hermes program still would not. -- David Hopwood [EMAIL PROTECTED] -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Marshall [EMAIL PROTECTED] wrote: If the relations are to be considered opaque, then there's clearly no aliasing going on. Not certain I understand, but I think I agree. My condition, though, was that relations be opaque. Since you will be violating that condition further on down, I just felt that it's useful to point that out here. No, such a language is entirely useful, since relational assignment is *more* expressive than insert/update/delete. Assignment is more powerful *as* assignment. However, it is less powerful when the task at hand is deriving new relations from old ones. Assignment provides absolutely no tools for doing that. I thought you were trying to remove those tools from the language entirely in order to remove the corresponding aliasing problems. I guess I was wrong, since you make it clear below that you intend to keep at least basic set operations on relations in your hypothetical language. Consider: i := i + 1; Note that the new value of i didn't just appear out of thin air; it was in fact based on the previous value of i. Right. That's exactly the kind of thing I thought you were trying to avoid. So we can define insert, update and delete in terms of relational assignment, relational subtraction, and relational union. Type checking details omitted. Then the problem is in the step where you assign the new relation to the old relational variable. You need to check that the new relation conforms to the invariants that are expressed on that relational variable. If you model this as assignment of relations (or relation values... I'm unclear on the terminology at this point) then naively this requires scanning through an entire set of relations in the constraint, to verify that the invariant holds. You've may have avoided aliasing in any conventional sense of the word by stretching the word itself beyond breaking... but you've only done it by proactively accepting its negative consequences. It remains non-trivial to scan through a 2 GB database table to verify that some attribute of every tuple matches some attribute of another table, even if you call the entire thing one relational variable. The implementation, of course, isn't at all going to make a copy of the entire (possibly several GB) relation and rewrite it all every time it makes a change, and it isn't going to give up and rescan all possible invariants every time every change is made. In other words, you've risen to a layer of abstraction where the aliasing problem does not exist. The implementation is still going to deal with the aliasing problem, which will resurface once you pass over to the other side of the abstraction boundary. -- Chris Smith - Lead Software Developer / Technical Trainer MindIQ Corporation -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error? [correction]
David Hopwood wrote: Darren New wrote: From what I can determine, the authors seem to imply that typestate is dataflow analysis modified in (at least) two ways: 1) When control flow joins, the new typestate is the intersection of typestates coming into the join, where as dataflow analysis doesn't guarantee that. (They imply they think dataflow analysis is allowed to say the variable might or might not be initialized here, while typestate would ensure the variable is uninitialized.) Right, but this is a disadvantage of their typestate algorithm. It is why the example in Figure 2 of http://www.cs.ubc.ca/local/reading/proceedings/spe91-95/spe/vol25/issue4/spe950wk.pdf fails to check, even though it obviously initializes all variables. Consider the equivalent Java program: I mixed up Figures 1 and 2. Here is the Java program that Figure 2 should be compared to: public class LoopInitTest { public static String getString() { return foo; } public static void main(String[] args) { String line = getString(); boolean is_last = false; while (!is_last) { if (line.charAt(0) == 'q') { is_last = true; } // insert line into inputs (not important for analysis) if (!is_last) { line = getString(); } } } } which compiles without error, because is_last is definitely initialized. -- David Hopwood [EMAIL PROTECTED] -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error? [correction]
David Hopwood wrote: public class LoopInitTest { public static String getString() { return foo; } public static void main(String[] args) { String line = getString(); boolean is_last = false; while (!is_last) { if (line.charAt(0) == 'q') { is_last = true; } // insert line into inputs (not important for analysis) if (!is_last) { line = getString(); } } } } which compiles without error, because is_last is definitely initialized. At what point do you think is_last or line would seem to not be initialized? They're both set at the start of the function, and (given that it's Java) nothing can unset them. At the start of the while loop, it's initialized. At the end of the while loop, it's initialized. So the merge point of the while loop has it marked as initialized. Now, if the insert line into inputs actually unset line, then yes, you're right, Hermes would complain about this. Alternately, if you say if (x) v = 1; if (x) v += 1; then Hermes would complain when it wouldn't need to. However, that's more a limitation of the typestate checking algorithms than the concept itself; that is to say, clearly the typestate checker could be made sufficiently intelligent to track most simple versions of this problem and not complain, by carrying around conditionals in the typestate description. -- Darren New / San Diego, CA, USA (PST) This octopus isn't tasty. Too many tentacles, not enough chops. -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error? [correction]
Darren New wrote: Now, if the insert line into inputs actually unset line, then yes, you're right, Hermes would complain about this. Oh, I see. You translated from Hermes into Java, and Java doesn't have the insert into statement. Indeed, the line you commented out is *exactly* what's important for analysis, as it unsets line. Had it been insert copy of line into inputs then you would not have gotten any complaint from Hermes, as it would not have unset line there. In this case, it's equivalent to if (!is_line) line = getString(); if (!is_line) use line for something... except the second test is at the top of the loop instead of the bottom. -- Darren New / San Diego, CA, USA (PST) This octopus isn't tasty. Too many tentacles, not enough chops. -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Chris Smith wrote: Marshall [EMAIL PROTECTED] wrote: If the relations are to be considered opaque, then there's clearly no aliasing going on. Not certain I understand, but I think I agree. My condition, though, was that relations be opaque. Since you will be violating that condition further on down, I just felt that it's useful to point that out here. No, such a language is entirely useful, since relational assignment is *more* expressive than insert/update/delete. Assignment is more powerful *as* assignment. However, it is less powerful when the task at hand is deriving new relations from old ones. At the implementation level, it makes some things harder, however as a logical model, it is more powerful. While this is very much a real world issue, it is worth noting that it is a performance issue *merely* and not a semantic issue. Assignment provides absolutely no tools for doing that. I thought you were trying to remove those tools from the language entirely in order to remove the corresponding aliasing problems. I guess I was wrong, since you make it clear below that you intend to keep at least basic set operations on relations in your hypothetical language. Consider: i := i + 1; Note that the new value of i didn't just appear out of thin air; it was in fact based on the previous value of i. Right. That's exactly the kind of thing I thought you were trying to avoid. I was under the impression tat Joachim, for example, did not consider i+1 as an alias for i. So we can define insert, update and delete in terms of relational assignment, relational subtraction, and relational union. Type checking details omitted. Then the problem is in the step where you assign the new relation to the old relational variable. You need to check that the new relation conforms to the invariants that are expressed on that relational variable. If you model this as assignment of relations (or relation values... I'm unclear on the terminology at this point) then naively this requires scanning through an entire set of relations in the constraint, to verify that the invariant holds. You've may have avoided aliasing in any conventional sense of the word by stretching the word itself beyond breaking... but you've only done it by proactively accepting its negative consequences. It remains non-trivial to scan through a 2 GB database table to verify that some attribute of every tuple matches some attribute of another table, even if you call the entire thing one relational variable. The implementation, of course, isn't at all going to make a copy of the entire (possibly several GB) relation and rewrite it all every time it makes a change, and it isn't going to give up and rescan all possible invariants every time every change is made. In other words, you've risen to a layer of abstraction where the aliasing problem does not exist. The implementation is still going to deal with the aliasing problem, which will resurface once you pass over to the other side of the abstraction boundary. Yes, these *performance* issues make assignment prohibitive for real-world use, at least if we are talking about data management in the large. This is not the same thing as saying the resulting language is a toy language, though; its semantics are quite interesting and possibly a better choice for *defining* the semantics of the imperative operations than directly modelling the imperative operations. (Or maybe not.) In any event, it's worth thinking about, even if performance considerations make it not worth implementing. Marshall -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Marshall [EMAIL PROTECTED] wrote: Yes, these *performance* issues make assignment prohibitive for real-world use, at least if we are talking about data management in the large. This is not the same thing as saying the resulting language is a toy language, though; its semantics are quite interesting and possibly a better choice for *defining* the semantics of the imperative operations than directly modelling the imperative operations. (Or maybe not.) In any event, it's worth thinking about, even if performance considerations make it not worth implementing. My toy language comment was directed at a language that I mistakenly thought you were proposing, but that you really weren't. You can ignore it, and all the corresponding comments about assignment being less powerful, etc. I was apparently not skilled at communication when I tried to say that in the last message. It is, perhaps, worth thinking about. My assertion here (which I think I've backed up, but there's been enough confusion that I'm not surprised if it was missed) is that the underlying reasons that performance might be poor for this language are a superset of the performance problems caused by aliasing. Hence, when discussing the problems caused by aliasing for the performance of language implementations (which I believe was at some point the discussion here), this isn't a particularly useful example. It does, though, have the nice property of hiding the aliasing from the semantic model. That is interesting and worth considering, but is a different conversation; and I don't know how to start it. -- Chris Smith - Lead Software Developer / Technical Trainer MindIQ Corporation -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error? [correction]
Darren New wrote: David Hopwood wrote: public class LoopInitTest { public static String getString() { return foo; } public static void main(String[] args) { String line = getString(); boolean is_last = false; while (!is_last) { if (line.charAt(0) == 'q') { is_last = true; } // insert line into inputs (not important for analysis) if (!is_last) { line = getString(); } } } } which compiles without error, because is_last is definitely initialized. At what point do you think is_last or line would seem to not be initialized? They're both set at the start of the function, and (given that it's Java) nothing can unset them. At the start of the while loop, it's initialized. At the end of the while loop, it's initialized. So the merge point of the while loop has it marked as initialized. Apparently, Hermes (at least the version of it described in that paper) essentially forgets that is_last has been initialized at the top of the loop, and so when it does the merge, it is merging 'not necessarily initialized' with 'initialized'. This sounds like a pretty easy thing to fix to me (and maybe it was fixed later, since there are other papers on Hermes' typestate checking that I haven't read yet). -- David Hopwood [EMAIL PROTECTED] -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Marshall schrieb: Joachim Durchholz wrote: Marshall schrieb: I would say, records in SQL have value, and their identity is exactly their value. Definitely not. You can have two equal records and update just one of them, yielding non-equal records; by my definition (and by intuition), this means that the records were equal but not identical. This sort of thing comes up when one has made the mistake of not defining any keys on one's table and needs to rectify the situation. It is in fact considered quite a challenge to do. My preferred technique for such a situation is to create a new table with the same columns and SELECT DISTINCT ... INTO ... then recreate the original table. So that way doesn't fit your model. I was mentioning multiple equal records just as an example where the records have an identity that's independent of the record values. How else might you do it? I suppose there are nonstandard extensions, such as UPDATE ... LIMIT. And in fact there might be some yucky thing that made it in to the standard that will work. Right, it might be difficult to update multiple records that are exactly the same. It's not what SQL was intended for, and difficult to do anyway - I was thinking of LIMIT (I wasn't aware that it's nonstandard), and I agree that there may be other ways to do it. However, I wouldn't overvalue that case. The Jane/John Doe example posted elsewhere in this thread is strictly within the confines of what SQL was built for, yet there is aliasing. But what you descrbe is certainly *not* possible in the relational algebra; alas that SQL doesn't hew closer to it. Would you agree? Yup, SQL (particularly its update semantics) aren't relational semantics. Still, it's SQL we've been talking about. Would you also agree that if a language *did* adhere to relation semantics, then relation elements would *not* have identity? (Under such a language, one could not have two equal records, for example.) Any identity that one could define within relational semantics would be just equality, so no, it wouldn't be very helpful (unless as part of a greater framework). However, that's not really a surprise. Aliasing becomes relevant (even observable) only when there's updates, and relational algebra doesn't have updates. I do not see that they have any identity outside of their value. We can uniquely identify any particular record via a key, but a table may have more than one key, and an update may change the values of one key but not another. So it is not possible in general to definitely and uniquely assign a mapping from each record of a table after an update to each record of the table before the update, and if you can't do that, then where is the record identity? Such a mapping is indeed possible. Simply extend the table with a new column, number the columns consecutively, and identify the records via that column. That doesn't work for me. It is one thing to say that for all tables T, for all elements e in T, e has identity. It is a different thing to say that for all tables T, there exists a table T' such that for all elements e in T', e has identity. Let me continue that argument: Records in T' have identity. From an SQL point-of-view, there's no difference between the records in T' and records in other tables, so they must share all properties, in particular that of having an identity. (I agree that's not as convincing as seeing aliasing in action. See below.) But even if you don't do that, there's still identity. It is irrelevant whether the programs can directly read the value of the identity field; the adverse effects happen because updates are in-place. (If every update generated a new record, then we'd indeed have no identity.) Okay. At this point, though, the term aliasing has become extremely general. I believe i+1+1 is an alias for i+2 under this definition. No, i+1+1 isn't an alias in itself. It's an expression - to be an alias, it would have to be a reference to something. However, a[i+1+1] is an alias to a[i+2]. Not that this is particularly important - 1+1 is replacable by 2 in every context, so this is essentially the same as saying a[i+2] is an alias of a[i+2], which is vacuously true. To me, the SQL with the where clause is like i+2, not like a[2]. I'd say WHERE is like [i+2]: neither is valid on its own, it's the selector part of a reference into a table. A query is simply an expression that makes mention of a variable. However I would have to admit that if SQL table elements have identity, it would be more like the array example. OK. (The model of SQL I use in my head is one with set semantics, and I am careful to avoid running in to bag semantics by, for example, always defining at least one key, and using SELECT DISTINCT where necessary. But it is true that the strict set semantics in my head are not the wobbly bag semantics in the standard.) Set-vs.-bag doesn't affect SQL's
Re: What is a type error?
Marshall [EMAIL PROTECTED] wrote: In general, I feel that records are not the right conceptual level to think about. Unfortunately, they are the right level. Actually,the right level might even be lower, the fields within a record, but that's moving even farther away from the direction you wish to go. The right level is the lowest level at which the programmer can see and manipulate values. Thus, since a programmer can see and manipulate fields within records in SQL that is the level we need to concern ourselves with when talking about aliasing in SQL. In other words, since a SELECT (or UPDATE) statement can talk about specific fields of specific records within the table (not directly, but reliably, which I'll explain in a moment) then those are the primitive objects in SQL that can be aliased. What I meant by not directly, but reliably: If you have a fixed database, and you do two selects which specify the same sets of fields to be selected and the same keys to select records with, one expects the two selects to return the same values. You do not necessarily expect the records to be returned in the same order, but you do expect the same set of records to be returned and the records to have the same values in the same fields. Further, if you do an update, you expect certain fields of certain records to change (and be reflected in subsequent selects). However, therein lies the rub, if you do a select on some records, and then an update that changes those records, the records you have from the select have either changed or show outdated values. If there is some way to refer to the records you first selected before the update, then you have an aliasing problem, but maybe one can't do that. One could also have an aliasing problem, if one were allowed to do two updates simultaneously, so that one update could changed records in the middle of the other changing the records. However, I don't know if that's allowed in the relational algebra either. (I think I finally see your point, and more on that later.) I do not see that they have any identity outside of their value. We can uniquely identify any particular record via a key, but a table may have more than one key, and an update may change the values of one key but not another. So it is not possible in general to definitely and uniquely assign a mapping from each record of a table after an update to each record of the table before the update, and if you can't do that, then where is the record identity? I don't know if your point of having two keys within one table amkes a difference. If one can uniquely identify a record, then it has identity. The fact that there is another mapping where one may not be able to uniquely identify the record is not necessarily relevant. But what you descrbe is certainly *not* possible in the relational algebra; alas that SQL doesn't hew closer to it. Would you agree? Would you also agree that if a language *did* adhere to relation semantics, then relation elements would *not* have identity? (Under such a language, one could not have two equal records, for example.) Ok, here I think I will explain my understanding of your point. If we treat each select statement and each update statements as a separate program, i.e. we can't save records from one select statement to a later one (and I think in the relational algebra one cannot), then each single (individual) select statement or update statement is a functional program. That is we can view a single select statement as doing a fold over some incoming data, but it cannot change the data. Similarly, we can view an update statement as creating a new copy of the table with mutations in it, but the update statement can only refer to the original immutable table that was input. (Well, that's atleast how I recall the relational algebra.) It is the second point, about the semantics of the update statement which is important. There are no aliasing issues because in the relational algebra one cannot refer to the mutated values in an update statement, every reference in an update statement is refering to the unmutated input (again, that's my recollection). (Note, if my recollection is off, and one can refer to the mutated records in an update statement, perhaps they can explain why aliasing is not an issue in it.) Now for some minor points: For example, we might ask in C, if we update a[i], will the value of b[j] change? And we can't say, because we don't know if there is aliasing. But in Fortran, we can say that it won't, because they are definitely different variables. Unfortunately, your statement about FORTRAN is not true, if a and b are parameters to a subroutine (or members of a common block, or equivalenced, or ...), then an update of a[i] might change b[j]. This is in fact, an important point, it is in subroutines (and subroutine like things), where we have local names for things (bindings) that are actually names for things which may
Re: What is a type error?
Chris F Clark schreef: If you have a fixed database, and you do two selects which specify the same sets of fields to be selected and the same keys to select records with, one expects the two selects to return the same values. When your fixed means read-only, or (fully) locked, then yes. Modern databases also have modes without locks, so without special attention (like maybe your fixed) the two selects do not necessarily return the same set of records. Further, if you do an update, you expect certain fields of certain records to change (and be reflected in subsequent selects). Doesn't your fixed block updates? However, therein lies the rub, if you do a select on some records, and then an update that changes those records, the records you have from the select have either changed or show outdated values. Not necessarily outdated: values can be fixed in time for a purpose. Example: calculating interest is done once per day, but the whole process takes more than a day. Some systems implemented a lock plus requery just before the update, to check for unacceptable changes in the stored data; this to prevent having to keep locks while waiting. If there is some way to refer to the records you first selected before the update, then you have an aliasing problem, but maybe one can't do that. One could also have an aliasing problem, if one were allowed to do two updates simultaneously, so that one update could changed records in the middle of the other changing the records. Some databases allow you to travel back in time: run this query on the data of 1 year ago. All previous values are kept behind the current value. -- Affijn, Ruud Gewoon is een tijger. -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Joachim Durchholz wrote: Marshall schrieb: But what you descrbe is certainly *not* possible in the relational algebra; alas that SQL doesn't hew closer to it. Would you agree? Yup, SQL (particularly its update semantics) aren't relational semantics. Still, it's SQL we've been talking about. Would you also agree that if a language *did* adhere to relation semantics, then relation elements would *not* have identity? (Under such a language, one could not have two equal records, for example.) Any identity that one could define within relational semantics would be just equality, so no, it wouldn't be very helpful (unless as part of a greater framework). However, that's not really a surprise. Aliasing becomes relevant (even observable) only when there's updates, and relational algebra doesn't have updates. Good point. Perhaps I should have said relational algebra + variables with assignment. It is interesting to consider assignment vs. the more restricted update operators: insert, update, delete. I think I see what you mean about the restricted update operators working on identity. Okay. At this point, though, the term aliasing has become extremely general. I believe i+1+1 is an alias for i+2 under this definition. No, i+1+1 isn't an alias in itself. It's an expression - to be an alias, it would have to be a reference to something. However, a[i+1+1] is an alias to a[i+2]. Not that this is particularly important - 1+1 is replacable by 2 in every context, so this is essentially the same as saying a[i+2] is an alias of a[i+2], which is vacuously true. To me, the SQL with the where clause is like i+2, not like a[2]. I'd say WHERE is like [i+2]: neither is valid on its own, it's the selector part of a reference into a table. Is it possible you're being distracted by the syntax? WHERE is a binary operation taking a relation and a filter function. I don't think of filters as being like array indexing; do they appear analogous to you? (Always a difficult question because often times A and B share some qualities and not others, and what does one call that?) However the situation in SQL strikes me as being different. If one is speaking of base tables, and one has two queries, one has everything one needs to know simply looking at the queries. If views are involved, one must further examine the definition of the view. Sure. Usually, SQL itself is enough abstract that no additional abstraction happens; so even if you have aliasing, you usually know about it. I.e. when compared to what you were saying about Java: It is possible to have multiple references to a single Java mutable object and not know it., the and not know it part is usually missing. Yes! The picture changes as soon as the SQL statements get hidden behind layers of abstraction, say result sets and prepared query handles. ... Um... let me also restate the preconditions for aliasing become a problem. You need three elements for that: 1) Identities that can be stored in multiple places. 2) Updates. 3) Abstraction (of updatable entities). Without identities, there cannot be aliasing. Without updates, the operation that makes aliasing problematic is excluded. Without abstraction, you may have aliasing but can clearly identify it, and don't have a problem. Aha! That description works very well for me. I admit that identity cannot be reliably established in SQL. The identity of a record isn't available (unless via OID extensions). However, it is there and it can have effect. Yes, I think I see what you mean now. This is in part a consequence of the restricted update operators. Thanks, Marshall -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Chris F Clark wrote: Marshall [EMAIL PROTECTED] wrote: In general, I feel that records are not the right conceptual level to think about. Unfortunately, they are the right level. Actually,the right level might even be lower, the fields within a record, but that's moving even farther away from the direction you wish to go. The right level is the lowest level at which the programmer can see and manipulate values. But how is this not always the bit? I do not see that they have any identity outside of their value. We can uniquely identify any particular record via a key, but a table may have more than one key, and an update may change the values of one key but not another. So it is not possible in general to definitely and uniquely assign a mapping from each record of a table after an update to each record of the table before the update, and if you can't do that, then where is the record identity? I don't know if your point of having two keys within one table amkes a difference. If one can uniquely identify a record, then it has identity. The fact that there is another mapping where one may not be able to uniquely identify the record is not necessarily relevant. The issue with two keys is that the keys may not *agree* on the mapping before vs. after the update. For example, we might ask in C, if we update a[i], will the value of b[j] change? And we can't say, because we don't know if there is aliasing. But in Fortran, we can say that it won't, because they are definitely different variables. Unfortunately, your statement about FORTRAN is not true, if a and b are parameters to a subroutine (or members of a common block, or equivalenced, or ...), then an update of a[i] might change b[j]. Ah, common blocks. How that takes me back! But your point is a good one; I oversimplified. Marshall again: Fair enough. I could only remember three, but I was sure someone else could name more. :-) There actual are some more, but very rare, for example there was call- by-text-string, which is sort of like call-by-name, but allows a parameter to reach into the calling routine and mess with local variables therein. Most of the rare ones have really terrible semantics, and are perhaps best forgotten except to keep future implementors from choosing them. For example, call-by-text-string is really easy to implement in a naive interpreter that reparses each statement at every invocation, but but it exposes the implementation properties so blatantly that writing a different interpretor is nearly impossible. If I recall correctly, there are also some call-by- methods that take into account networks and addressing space issues--call-by-reference doesn't work if one cannot see the thing being referenced. Of coruse, one must then ask whether one considers remote-procedure-call and/or message-passing to be the same thing as a local call. One last nit, Call-by-value is actually call-by-copy-in-copy-out when generalized. There are some principles that one can use to organize the parameter passing methods. However, the space is much richer than people commonly know about (and that is actually a good thing, that most people aren't aware of the richness, simplicity is good). Fair enough. Marshall -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
We Chris's stick together, as always. Marshall [EMAIL PROTECTED] wrote: Unfortunately, they are the right level. Actually,the right level might even be lower, the fields within a record, but that's moving even farther away from the direction you wish to go. The right level is the lowest level at which the programmer can see and manipulate values. But how is this not always the bit? First of all, we should be consistent in speaking about the logical language semantics, which may not include bits anyway. That said, if bits were the primitive concept of data in a language, then we'd be able to get away with talking about higher-level concepts because we agree to always manipulate a larger structure (a byte, for example) as a pure value. If we were using bitwise operators, then we wouldn't be able to get away with it, for example. That said, it's also quite possible to consider aliasing on higher levels as well; it's just not possible to point out the lack of aliasing for higher levels of abstraction, and thus conclude that no aliasing exists. Aliasing is still possible for entities within those layers of abstraction. -- Chris Smith - Lead Software Developer / Technical Trainer MindIQ Corporation -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Marshall [EMAIL PROTECTED] wrote: Is it possible you're being distracted by the syntax? WHERE is a binary operation taking a relation and a filter function. I don't think of filters as being like array indexing; do they appear analogous to you? (Always a difficult question because often times A and B share some qualities and not others, and what does one call that?) This is definitely a good question. Nevertheless, I think we're fooling ourselves if we decide that we've made progress by defining part of the problem out of existence. The original problem was that when there are several ways of getting access to something that is identical (rather than just equal), this causes problems for invariant checking. These several ways of accessing data are being called 'aliasing' -- I have no comment on whether that's standard usage or not, because I don't know. I suspect that aliasing is normally used for something closer to the physical level. The point is that whatever aliasing really means, what we're discussing is having two paths by which one may access identical data. So there are infinitely complex ways in which this can occur. I can have pointers that are aliased. I can have integers, which by themselves are just plain values, but can alias each other as indices into an array. I can have strings that do the same thing for an associative array. I can, of course, have arbitrarily more complex data structures with exactly the same issue. I can also have two different WHERE clauses, which when applied to the same relation yield exactly the same set of tuples. The problem arises when code is written to update some value (or set of tuples, in the SQL case, since definitions differ there) using one pointer/int/string/WHERE clause/etc, and at the same time an invariant was written using the other pointer/int/WHERE/etc, the result is that either of: a) The compiler has to be able to prove that the two are not aliased b) The compiler has to re-evaluate the invariant from scratch when this operation is performed. (Yeah, there's actually a whole continuum between a and b, based on more limited abilities of the compiler to prove things. I'm oversimplifying, and I think it's rather benign in this case.) So in this sense, a WHERE clause is a binary operation in exactly the same way that an array indexing expression is a binary operation, and both are capable of aliasing in at least this logical sense. Now it's certainly true that languages with unrestricted pointers are less successful at limiting the scope of re-evaluation of invariants. In the array or SQL relation cases, there's a limited set of value/tuple modifications that might cause the invariant to need rechecking (specifically those that point to the same array, or to the relation or one of its views). A language with unrestricted untyped pointers, if it doesn't get lucky with the capture analysis, may have to re-evaluate all invariants in the application on any given assignment. Nevertheless, the re-evaluation of the invariant still needs to be done. -- Chris Smith - Lead Software Developer /Technical Trainer MindIQ Corporation -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Marshall schrieb: Good point. Perhaps I should have said relational algebra + variables with assignment. It is interesting to consider assignment vs. the more restricted update operators: insert, update, delete. Actually I see it the other way round: assignment is strictly less powerful than DML since it doesn't allow creating or destroying variables, while UPDATE does cover assignment to fields. (However, it's usually new+assignment+delete vs. INSERT+UPDATE+DELETE, at which point there is not much of a difference.) Okay. At this point, though, the term aliasing has become extremely general. I believe i+1+1 is an alias for i+2 under this definition. No, i+1+1 isn't an alias in itself. It's an expression - to be an alias, it would have to be a reference to something. However, a[i+1+1] is an alias to a[i+2]. Not that this is particularly important - 1+1 is replacable by 2 in every context, so this is essentially the same as saying a[i+2] is an alias of a[i+2], which is vacuously true. To me, the SQL with the where clause is like i+2, not like a[2]. I'd say WHERE is like [i+2]: neither is valid on its own, it's the selector part of a reference into a table. Is it possible you're being distracted by the syntax? I think that's very, very unlikely ;-) WHERE is a binary operation taking a relation and a filter function. I don't think of filters as being like array indexing; do they appear analogous to you? Yes. Filters are just like array indexing: both select a subset of variables from a collection. In SQL, you select a subset of a table, in a programming language, you select a subset of an array. (The SQL selection mechanism is far more flexible in the kinds of filtering you can apply, while array indexing allows filtering just by ordinal position. However, the relevant point is that both select things that can be updated.) I admit that identity cannot be reliably established in SQL. The identity of a record isn't available (unless via OID extensions). However, it is there and it can have effect. Yes, I think I see what you mean now. This is in part a consequence of the restricted update operators. I don't think so. You can update SQL records any way you want. The unavailability of a record identity is due to the fact that, well, it's unavailable in standard SQL. Regards, Jo -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Joachim Durchholz wrote: Marshall schrieb: Good point. Perhaps I should have said relational algebra + variables with assignment. It is interesting to consider assignment vs. the more restricted update operators: insert, update, delete. Actually I see it the other way round: assignment is strictly less powerful than DML since it doesn't allow creating or destroying variables, while UPDATE does cover assignment to fields. Oh, my. Well, for all table variables T, there exists some pair of values v and v', such that we can transition the value of T from v to v' via assignment, but not by any single insert, update or delete. The reverse is not true. I would consider that a solid demonstration of which one is more powerful. Further, it is my understanding that your claim of row identity *depends* on the restricted nature of DML; if the only mutator operation is assignment, then there is definitely no record identity. (However, it's usually new+assignment+delete vs. INSERT+UPDATE+DELETE, at which point there is not much of a difference.) I am not sure what this means. Delete can be expressed in terms of assignment. (As can insert and update.) (Assignment can also be expressed in terms of insert and delete.) I don't know what new would be in a value-semantics, relational world. So I would say it's assignment vs. INSERT+UPDATE+DELETE, but perhaps I'm not understanding what you mean. Assignment by itself is complete. Insert and delete together are complete. Okay. At this point, though, the term aliasing has become extremely general. I believe i+1+1 is an alias for i+2 under this definition. No, i+1+1 isn't an alias in itself. It's an expression - to be an alias, it would have to be a reference to something. However, a[i+1+1] is an alias to a[i+2]. Not that this is particularly important - 1+1 is replacable by 2 in every context, so this is essentially the same as saying a[i+2] is an alias of a[i+2], which is vacuously true. To me, the SQL with the where clause is like i+2, not like a[2]. I'd say WHERE is like [i+2]: neither is valid on its own, it's the selector part of a reference into a table. Is it possible you're being distracted by the syntax? I think that's very, very unlikely ;-) Well, I just thought I'd ask. :-) WHERE is a binary operation taking a relation and a filter function. I don't think of filters as being like array indexing; do they appear analogous to you? Yes. Filters are just like array indexing: both select a subset of variables from a collection. I can't agree with this wording. A filter produces a collection value from a collection value. I don't see how variables enter in to it. One can filter either a collection constant or a collection variable; if one speaks of filtering a collection variable, on is really speaking of filtering the collection value that the variable currently contains; filtering is not an operation on the variable as such, the way the address of operator is. Note you can't update the result of a filter. In SQL, you select a subset of a table, in a programming language, you select a subset of an array. (The SQL selection mechanism is far more flexible in the kinds of filtering you can apply, while array indexing allows filtering just by ordinal position. However, the relevant point is that both select things that can be updated.) When you have been saying select things that can be updated I have been assuming you meant that one can derive values from variables, and that some other operation can update that variable, causing the expression, if re-evaluated, to produce a different value. However the phrase also suggests that you mean that the *result* of the select can *itself* be updated. Which one do you mean? (Or is there a third possibility?) I admit that identity cannot be reliably established in SQL. The identity of a record isn't available (unless via OID extensions). However, it is there and it can have effect. Yes, I think I see what you mean now. This is in part a consequence of the restricted update operators. I don't think so. You can update SQL records any way you want. The unavailability of a record identity is due to the fact that, well, it's unavailable in standard SQL. I disagree. The concept of record-identity is quite tenuous, and depends on specifics of SQL semantics. (In fact I'm not entirely convinced it's definitely there even in SQL.) It certainly does not hold in, say, set theory. Set elements do not have identity; they only have value. If we have a set-semantics language that supports set variables with assignment, we still do not have element-identity. Marshall -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
David Hopwood [EMAIL PROTECTED] wrote: Chris Smith wrote: If checked by execution, yes. In which case, I am trying to get my head around how it's any more true to say that functional languages are compilable postconditions than to say the same of imperative languages. A postcondition must, by definition [*], be a (pure) assertion about the values that relevant variables will take after the execution of a subprogram. Okay, my objection was misplaced, then, as I misunderstood the original statement. I had understood it to mean that programs written in pure functional languages carry no additional information except for their postconditions. -- Chris Smith - Lead Software Developer / Technical Trainer MindIQ Corporation -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Darren New [EMAIL PROTECTED] wrote: I'm not sure what linear or uniqueness typing is. It's typestate, and if I remember correctly the papers I read 10 years ago, the folks at TJWatson that invented Hermes also invented the concept of typestate. They at least claim to have coined the term. Coining the term is one thing, but I feel pretty confident that the idea was not invented in 1986 with the Hermes language, but rather far earlier. Perhaps they may have invented the concept of considering it any different from other applications of types, though. I still have trouble delineating how to consider typestate different from ordinary types in formal terms, unless I make the formal model complex enough to include some kind of programmer-defined identifiers such as variables. The distinction is not at all relevant to common type system models based on the lambda calculus. While acknowledging, on the one hand, that the word typestate is used to describe this, I also object that types have *always* been assigned to expressions in differing type environments. Otherwise, it would be impossible to assign types to lambda abstractions in the simply typed lambda calculus, the simplest of all generally studied type systems. What is being named here is the overcoming of a limitation that programming language designers imposed upon themselves, whether from not understanding the theoretical research or not believing it important, I don't know. -- Chris Smith - Lead Software Developer / Technical Trainer MindIQ Corporation -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Andreas Rossberg [EMAIL PROTECTED] writes: Uh, aliasing all over the place! Actually, I think that logic programming, usually based on deep unification, brings by far the worst incarnation of aliasing issues to the table. I agree that deep unification, as implemented in Prolog, brings with it lots of aliasing problems. However, these problems have been recognized from the seventies, and people have tried to solve them. There have been a huge variety of mode systems added to various dialects of Prolog over the years, which each attempt to tackle (various parts of) the aliasing problem, none of them fully successfully in my view, since their designers usually actually *wanted* to retain at least *some* of the expressive power of the logic variable. Some non-Prolog logic languages have departed from this approach, the earliest being the Relational Language. To tie this message to another thread, the Relational Language had a strict mode system that is as identical as possible to the notion of typestate in NIL and Hermes given the limitations of comparing declarative apples with imperative oranges, but significantly earlier, 1979 vs 1986 IIRC. Marshall wrote: I have explored the OO path to its bitter end and am convinced it is not the way. So what is left? Uniqueness types and logic programming, I suppose. I enjoy logic programming but it doesn't seem quite right. But notice: no pointers there! And it doesn't seem to suffer from the lack. Of course, the main logic programming language today that disallows the use of unification for arbitrary aliasing is Mercury. It enforces this through a strong mode system. Our motivation for the design of this mode system was precisely to eliminate the problems Andreas identified above. However it has also turned out to be an excellent foundation for Mercury's notion of unique modes, its equivalent of uniqueness types, which Mercury uses to express I/O. Zoltan Somogyi [EMAIL PROTECTED] http://www.cs.mu.oz.au/~zs/ Department of Computer Science and Software Engineering, Univ. of Melbourne -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Joachim Durchholz wrote: You can have aliasing without pointers; e.g. arrays are fully sufficient. If i = j, then a [i] and a [j] are aliases of the same object. Marshall [EMAIL PROTECTED] writes: I am having a hard time with this very broad definition of aliasing. Would we also say that a[1+1] and a[2] are aliases? It seems to me, above, that we have only a, and with only one variable there can be no aliasing. As you can see from the replies, all these things which you do not consider aliasing are considered aliasing (by most everyone else, in fact these are exactly what is meant by aliasing). There is good reason for this. Let us explain this by looking again at the SQL example that has been kicked around some. SELECT * FROM persons WHERE name = John and SELECT * FROM persons WHERE surname = Doe Some set of records are shared by both select clauses. Those records are the aliased records. If the set is empty there is no problem. If we do not update the records from one of the select clauses, we also have no problem. However, if we update the records for one of the select clauses and the set of aliased records is not empty, the records returned by the other select clause have changed. Therefore, everything we knew to be true about the other select clause may or may not still be true. It's the may not case we are worried about. For example, in the imperative Mi-5 Q asks Moneypenny to run the first query (name = John) to get a list of agents for infiltrating Spectre. In another department, they're doing sexual reassignments and updating the database with the second query (surname = Doe) and making the name become Jane (along with other changes to make the transformation correct). So, when Q select John Doe from the first list and sends that agent on a mission Q is really sending Jane Doe and Spectre realizes that the agent is one and kills her. Not a happy result. In the functional Mi-5, the sexual reassignment group, makes clones of the agents reassigned, so that there is now both a John Doe and a Jane Doe. Of course, the payroll is a little more bloated, so that when Q says John Doe isn't needed for any further assignments, the Garbage Collection department does a reduction in force and gets rid of John Doe. The good news is that they didn't send Jane Doe to her death. The problem with aliasing, we want the things we knew true in one part of our program, i.e. the Johns on Q's list are still Johns and not Janes, to be true after we've run another part of our program. If the second part of our program can change what was true after the first part of our program, we can't depend on the results from the first part of our program, i.e. Q can send Jane Doe into certain death. The problem is compounded by the complexities of our programs. When Q selected his list, he didn't know of the department doing the reassignments (and couldn't know because they were part of a top-secret project). So, since bureaucracies grow to meet the increasing needs of the bureaucracy, often the solution is increased complexity, more regulations and paperwork to fill out, semaphores and locks to hold on critical data, read-only data, status requests, etc. All to keep Q from killing Jane by accident. Sometimes they work. the reassignment department has to wait 30-days for the clearance to perform their operation in which time John Doe completes the infiltration of Spectre saves the world from destruction and is ready for his next assignment. The key point is that each record (and even each field in each record) if it can be known by two names, is an alias. It is not sufficient to talk about whole variables as not being aliased if there is some way to refer to some part of the variable and change that part of the variable. Thus, a[1+1] is an alias for a[2] if you can have one part of the code talking about one and another part of the code talking about the other. To put it one still final way, consider the following code; assert(sorted(array a)); a[1] = 2; assert(sorted(array a)); // is this still true? Because a[1] is an alias of array a, you cannot be certain that the 2nd assert will not fire (fail) without additional analysis. assert(sorted(array a)); assert(a[0] = 2); assert(2 = a[2]); a[1] = 2; assert(sorted(array a)); // this is still true! -Chris -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Joachim Durchholz wrote: Marshall schrieb: In some cases, you need an additional level of conceptual indirection - instead of *doing* the updates, you write a function that *describes* them. But then what do you do with that function? I pass it to an engine that's imperative. However, that engine doesn't need to do aliasing anymore. In other words, I'm separating mutability and aliasing, so that they can't combine and explode. This would seem to be identical to what I am proposing. Let's say I have an employee database. John Smith just got hired on 1/1/2006 with a salary of $10,000. I need to record this fact somewhere. How do I do that without variables? Current-employees is a variable. Even if I have the space to keep all historical data, so I'm not deleting anything, I still have to have a variable for the latest version of the accumulated data. I can solve this without pointers, but I can't solve it without variables. As I said elsewhere, the record has an identity even though it isn't explicit in SQL. H. What can this mean? In general, I feel that records are not the right conceptual level to think about. In any event, I am not sure what you mean by non-explicit identity. I would say, records in SQL have value, and their identity is exactly their value. I do not see that they have any identity outside of their value. We can uniquely identify any particular record via a key, but a table may have more than one key, and an update may change the values of one key but not another. So it is not possible in general to definitely and uniquely assign a mapping from each record of a table after an update to each record of the table before the update, and if you can't do that, then where is the record identity? (The situation is even more dire when one considers that SQL actually uses bag semantics and not set semantics.) [...] I should like to learn more about these. I have some vague perception of the existence of linear logic, but not much else. However, I also already have an excellent solution to the pointer aliasing problem, so I'm less motivated. Pointers are just the most obvious form of aliasing. As I said elsewhere, there are others: array indexing, by-reference parameters, or even WHERE clauses in SQL statements. In fact, anything that selects an updatable piece of data from a collection is an alias, and incurs the same problems as pointers, though they may come in different disguises. Okay. At this point, though, the term aliasing has become extremely general. I believe i+1+1 is an alias for i+2 under this definition. That is so general that I am concerned it has lost its ability to identify problems specific to pointers. Actually SQL has references - they are called primary keys, but they are references nevertheless. I strongly object; this is quite incorrect. I grant you that from the 50,000 foot level they appear identical, but they are not. To qualify as a reference, there need to be reference and dereference operations on the reference datatype; there is no such operation is SQL. Would you say the relational algebra has references? Yes. Or, consider the classic prolog ancestor query. Let's say we're setting up as follows father(bob, joe). father(joe, john). Is joe a reference, here? After all, when we do the ancestor query for john, we'll see his father is joe and then use that to find joe's father. Keys in SQL are isomorphic to joe in the above prolog. Agreed. They are both references ;-P Again, by generalizing the term this far, I am concerned with a loss of precision. If joe in the prolog is a references, then reference is just a term for data that is being used in a certain way. The conection with a specfic address space has been lost in favor of the full domain of the datatype. (Some SQL dialects also offer synthetic ID fields that are guaranteed to remain stable over the lifetime of a record. Primary keys are updatable; there is nothing special about them. Right - I was wrong with identifying keys and identities. In fact the identity of an SQL record cannot be directly obtained (unless via those ID fields). The records still have identities. It's possible to have two WHERE clauses that refer to the same record, and if you update the record using one WHERE clause, the record returned by the other WHERE clause will have changed, too. Is this any different from saying that an expression that includes a variable will produce a different value if the variable changes? In other words, i+1 will return a different value if we update i. It seems odd to me to suggest that i+1 has identity. I can see that i has identity, but I would say that i+1 has only value. But perhaps the ultimate upshoot of this thread is that my use of terminology is nonstandard. With a repeatable read isolation level, you actually return to a declarative view of the database:
Re: What is a type error?
Marshall schrieb: Joachim Durchholz wrote: As I said elsewhere, the record has an identity even though it isn't explicit in SQL. H. What can this mean? In general, I feel that records are not the right conceptual level to think about. They are, when it comes to aliasing of mutable data. I think it's justified by the fact that aliased mutable data has a galling tendency to break abstraction barriers. (More on this on request.) In any event, I am not sure what you mean by non-explicit identity. The identity isn't visible from inside SQL. (Unless there's an OID facility available, which *is* an explicit identity.) I would say, records in SQL have value, and their identity is exactly their value. Definitely not. You can have two equal records and update just one of them, yielding non-equal records; by my definition (and by intuition), this means that the records were equal but not identical. I do not see that they have any identity outside of their value. We can uniquely identify any particular record via a key, but a table may have more than one key, and an update may change the values of one key but not another. So it is not possible in general to definitely and uniquely assign a mapping from each record of a table after an update to each record of the table before the update, and if you can't do that, then where is the record identity? Such a mapping is indeed possible. Simply extend the table with a new column, number the columns consecutively, and identify the records via that column. But even if you don't do that, there's still identity. It is irrelevant whether the programs can directly read the value of the identity field; the adverse effects happen because updates are in-place. (If every update generated a new record, then we'd indeed have no identity.) Okay. At this point, though, the term aliasing has become extremely general. I believe i+1+1 is an alias for i+2 under this definition. No, i+1+1 isn't an alias in itself. It's an expression - to be an alias, it would have to be a reference to something. However, a[i+1+1] is an alias to a[i+2]. Not that this is particularly important - 1+1 is replacable by 2 in every context, so this is essentially the same as saying a[i+2] is an alias of a[i+2], which is vacuously true. There's another aspect here. If two expressions are always aliases to the same mutable, that's usually easy to determine; this kind of aliasing is usually not much of a problem. What's more of a problem are those cases where there's occasional aliasing. I.e. a[i] and a[j] may or may not be aliases of each other, depending on the current value of i and j, and *that* is a problem because the number of code paths to be tested doubles. It's even more of a problem because testing with random data will usually not uncover the case where the aliasing actually happens; you have to go around and construct test cases specifically for the code paths that have aliasing. Given that references may cross abstraction barriers (actually that's often the purpose of constructing a reference in the first place), this means you have to look for your test cases at multiple levels of software abstraction, and *that* is really, really bad. That is so general that I am concerned it has lost its ability to identify problems specific to pointers. If the reference to pointers above means references, then I don't know about any pointer problems that cannot be reproduced, in one form or the other, in any of the other aliasing mechanisms. Again, by generalizing the term this far, I am concerned with a loss of precision. If joe in the prolog is a references, then reference is just a term for data that is being used in a certain way. The conection with a specfic address space has been lost in favor of the full domain of the datatype. Aliasing is indeed a more general idea that goes beyond address spaces. However, identity and aliasing can be defined in fully abstract terms, so I welcome this opportunity to get rid of a too-concrete model. The records still have identities. It's possible to have two WHERE clauses that refer to the same record, and if you update the record using one WHERE clause, the record returned by the other WHERE clause will have changed, too. Is this any different from saying that an expression that includes a variable will produce a different value if the variable changes? Yes. Note that the WHERE clause properly includes array indexing (just set up a table that has continuous numeric primary key, and a single other column). I.e. I'm not talking about how a[i] is an alias of a[i+1] after updating i, I'm talking about how a[i] may be an alias of a[j]. It seems odd to me to suggest that i+1 has identity. It doesn't (unless it's passed around as a closure, but that's irrelevant to this discussion). i does have identity. a[i] does have identity. a[i+1] does have identity. Let me say that for
Re: What is a type error?
Joachim Durchholz wrote: Marshall schrieb: Joachim Durchholz wrote: As I said elsewhere, the record has an identity even though it isn't explicit in SQL. H. What can this mean? In general, I feel that records are not the right conceptual level to think about. They are, when it comes to aliasing of mutable data. I think it's justified by the fact that aliased mutable data has a galling tendency to break abstraction barriers. (More on this on request.) I can see how pointers, or other kinds of aliasing (by my more restricted definition) break abstraction barries; it is hard to abstract something that can change out from under you uncontrollably. In any event, I am not sure what you mean by non-explicit identity. The identity isn't visible from inside SQL. (Unless there's an OID facility available, which *is* an explicit identity.) Agreed about OIDs. I would say, records in SQL have value, and their identity is exactly their value. Definitely not. You can have two equal records and update just one of them, yielding non-equal records; by my definition (and by intuition), this means that the records were equal but not identical. This sort of thing comes up when one has made the mistake of not defining any keys on one's table and needs to rectify the situation. It is in fact considered quite a challenge to do. My preferred technique for such a situation is to create a new table with the same columns and SELECT DISTINCT ... INTO ... then recreate the original table. So that way doesn't fit your model. How else might you do it? I suppose there are nonstandard extensions, such as UPDATE ... LIMIT. And in fact there might be some yucky thing that made it in to the standard that will work. But what you descrbe is certainly *not* possible in the relational algebra; alas that SQL doesn't hew closer to it. Would you agree? Would you also agree that if a language *did* adhere to relation semantics, then relation elements would *not* have identity? (Under such a language, one could not have two equal records, for example.) [The above paragraph is the part of this post that I really care about; feel free to ignore the rest if it's naive or annoying or whatever, but please do respond to this.] I do not see that they have any identity outside of their value. We can uniquely identify any particular record via a key, but a table may have more than one key, and an update may change the values of one key but not another. So it is not possible in general to definitely and uniquely assign a mapping from each record of a table after an update to each record of the table before the update, and if you can't do that, then where is the record identity? Such a mapping is indeed possible. Simply extend the table with a new column, number the columns consecutively, and identify the records via that column. That doesn't work for me. It is one thing to say that for all tables T, for all elements e in T, e has identity. It is a different thing to say that for all tables T, there exists a table T' such that for all elements e in T', e has identity. But even if you don't do that, there's still identity. It is irrelevant whether the programs can directly read the value of the identity field; the adverse effects happen because updates are in-place. (If every update generated a new record, then we'd indeed have no identity.) Okay. At this point, though, the term aliasing has become extremely general. I believe i+1+1 is an alias for i+2 under this definition. No, i+1+1 isn't an alias in itself. It's an expression - to be an alias, it would have to be a reference to something. However, a[i+1+1] is an alias to a[i+2]. Not that this is particularly important - 1+1 is replacable by 2 in every context, so this is essentially the same as saying a[i+2] is an alias of a[i+2], which is vacuously true. To me, the SQL with the where clause is like i+2, not like a[2]. A query is simply an expression that makes mention of a variable. However I would have to admit that if SQL table elements have identity, it would be more like the array example. (The model of SQL I use in my head is one with set semantics, and I am careful to avoid running in to bag semantics by, for example, always defining at least one key, and using SELECT DISTINCT where necessary. But it is true that the strict set semantics in my head are not the wobbly bag semantics in the standard.) There's another aspect here. If two expressions are always aliases to the same mutable, that's usually easy to determine; this kind of aliasing is usually not much of a problem. What's more of a problem are those cases where there's occasional aliasing. I.e. a[i] and a[j] may or may not be aliases of each other, depending on the current value of i and j, and *that* is a problem because the number of code paths to be tested doubles. It's even more of a problem because testing with random data will
Re: What is a type error?
On 13 Jul 2006 08:45:49 -0700, Marshall [EMAIL PROTECTED] wrote: On the other hand, there is no problem domain for which pointers are a requirement. I agree they are deucedly convenient, though. I would argue that pointers/references _are_ a requirement for I/O. I know of no workable method for interpreting raw bits as meaningful data other than to overlay a typed template upon them. Categorically disallowing address manipulation functionally cripples the language because an important class of programs (system programs) cannot be written. Of course, languages can go overboard the other way too. IMO, C did not need to provide address arithmetic at the language level, reinterpretable references and array indexing would have sufficed for any use. Modula 3's type safe view is an example of getting it right. It is quite reasonable to say I don't write _ so I don't need [whatever language feature enables writing it]. It is important, however, to be aware of the limitation and make your choice deliberately. George -- for email reply remove / from address -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
George Neuner wrote: On 13 Jul 2006 08:45:49 -0700, Marshall [EMAIL PROTECTED] wrote: On the other hand, there is no problem domain for which pointers are a requirement. I agree they are deucedly convenient, though. I would argue that pointers/references _are_ a requirement for I/O. I know of no workable method for interpreting raw bits as meaningful data other than to overlay a typed template upon them. I think I know what you mean. I agree that pointers are necessary for, e.g., device drivers. So I have to weaken my earlier statement. Categorically disallowing address manipulation functionally cripples the language because an important class of programs (system programs) cannot be written. That's fair, although I could argue how important systems programming is these days. (And C/C++ are cock-of-the-walk there anyway.) Of course, languages can go overboard the other way too. IMO, C did not need to provide address arithmetic at the language level, reinterpretable references and array indexing would have sufficed for any use. Modula 3's type safe view is an example of getting it right. It is quite reasonable to say I don't write _ so I don't need [whatever language feature enables writing it]. It is important, however, to be aware of the limitation and make your choice deliberately. Agreed. Marshall -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Chris Smith wrote: Marshall [EMAIL PROTECTED] wrote: What you are asking for is some subset of identity, and I've not yet succeeded in understanding exactly what it is or what its limits are... except that so far, it seems to have everything to do with pointers or aliasing. Perhaps it is specifically first-class identity, rather than identity per se. As in: the value of one variable can (be/refer to/depend on) the identity of another variable? I can certainly see this as as reasonable concept to consider. Yes, I think it's important. We've become so accustomed, in modern languages, to considering variables as first-class, that we cannot see any more that many of the problems attributed to variables (such as aliasing) are actually problems only with *first-class* variables. The historical reality that non-first-class variables are associated with more antique languages (Fortran/pre-OO) makes us shy away from even considering removing first-class status from variables. But I think it's an important point in the design space (although I certainly respect Andreas' reluctance; to paraphrase, first class or bust. :-) ) After all, what are the alternatives? Purely-functional languages remove themselves from a large class of problems that I consider important: data management. I have explored the OO path to its bitter end and am convinced it is not the way. So what is left? Uniqueness types and logic programming, I suppose. I enjoy logic programming but it doesn't seem quite right. But notice: no pointers there! And it doesn't seem to suffer from the lack. I really don't know enough about uniqueness types to evaluate them. But again, I already have my proposed solution: variables but no first-class variables. (And yes, I'm acutely aware of how problematic that makes closures. :-() I'm not yet convinced that this is any different from a language with standard pointer aliasing. If I have two tables A and B, and a foreign key from A into B, then I run into the same problems with enforcing constraints that I would see with a pointer model... when I update a relation, I need to potentially check every other relation that contains a foreign key into it, in order to ensure that its constraints are not violated by that constraint. That's the same thing that is being pointed out as a negative consequence of aliasing in other languages. No, that's not the same thing. What you are describing here is not an aliasing issue, but simply the consequences of allowing constraints to mention more than one variable. A foreign key constraint is a multi-variable constraint. Specifically, a foreign key from table A, attribute a to table B, attribute b is the constraint: forall a in A, exists b in B such that a = b. Note that two variables, A and B, are referenced in the constraint. There's confusion here coming from different usages of the word variable. Let us talk instead of values, and of the abstract structures that gives them meaning. I don't think that will work. We cannot discuss constraints, nor aliasing, without bringing variables in to the picture. (Aliasing may exist without variables but it is a non-problem then.) In both cases (invariants in a hypothetical imperative language, and in a relational database), the constraints make reference to these structures of values (relations, for example, or various kinds of data structures), and not to the individual values or objects that they contain. I am not certain what this means. In both cases, the problem is not that we don't know what structures to check to verify the invariant; rather, it's that we have to check ALL of the values in that structure. But not all values in all structures, as you do in the presence of aliasing. As someone pointed out, this is to be expected in a world of mutable things with identity that are globally locatable. It is simple fact that if I tell you I spoke to Barbara's husband, you may need to trace down who Barbara's husband is before you could discover that, for example, maybe I actually spoke to your boss, or to your nephew's best- friend's father. If databases are capable of modeling these kinds of relationships (and of course they are), then they are as susceptible to aliasing -- in a logical sense that avoids mention of pointer -- as anyone else. I don't see that they're the same thing. Similar is some ways, yes, but not the same. As I said much earlier, the terminology is problematic, and it may be that we (or just I) simply lack the precision of terms necessary for the conversation. Marshall -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Marshall schrieb: What about my example of SQL? Mutation, no pointers, no aliasing. Yet: useful. Sorry, but SQL does have aliasing. E.g. if you have records that have name=John, surname=Doe, the statements SELECT * FROM persons WHERE name = John and SELECT * FROM persons WHERE name = Doe are aliases of each other. The alias is actually in the WHERE clause. And this *can* get you into trouble if you have something that does UPDATE ... WHERE name = John and UPDATE ... WHERE surname = Doe e.g. doing something with the Johns, then updating the names of all Does, and finally restoring the Johns (but not recognizing that changing the names of all Does might have changed your set of Johns). Conceptually, this is just the same as having two different access path to the same memory cell. Or accessing the same global variable through a call-by-reference parameter and via its global name. BTW with views, you get not just aliasing but what makes aliasing really dangerous. Without views, you can simply survey all the queries that you are working with and lexically compare table and field names to see whether there's aliasing. With views, the names that you see in a lexical scope are not enough to determine aliasing. E.g. if you use a view that builds upon the set of Johns but aren't aware of that (possibly due to abstraction barriers), and you change the name field of all Does, then you're altering the view without a chance to locally catch the bug. That's just as bad as with any pointer aliasing problem. Regards, Jo -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Darren New wrote: Andreas Rossberg wrote: Yes, technically you are right. But this makes a pretty weak notion of mutability. All stateful data structures had to stay within their lexical scope, and could never be passed to a function. Not really. The way Hermes handles this is with destructive assignment. Each variable holds a value, and you (almost) cannot have multiple variables referring to the same value. OK, this is interesting. I don't know Hermes, is this sort of like a dynamically checked equivalent of linear or uniqueness typing? If you want to assign Y to X, you use X := Y after which Y is considered to be uninitialized. If you want X and Y to have the same value, you use X := copy of Y after which X and Y have the same value but are otherwise unrelated, and changes to one don't affect the other. Mh, but if I understand correctly, this seems to require performing a deep copy - which is well-known to be problematic, and particularly breaks all kinds of abstractions. Not to mention the issue with uninitialized variables that I would expect occuring all over the place. So unless I'm misunderstanding something, this feels like trading one evil for an even greater one. - Andreas -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Marshall wrote: After all, what are the alternatives? Purely-functional languages remove themselves from a large class of problems that I consider important: data management. Maybe, but I have yet to see how second-class variables are really more adequate in dealing with it. And note that even with second-class state you can still have aliasing issues - you just need mutable arrays and pass around indices. Keys in databases are a more general form of the same problem. I have explored the OO path to its bitter end and am convinced it is not the way. So what is left? Uniqueness types and logic programming, I suppose. I enjoy logic programming but it doesn't seem quite right. But notice: no pointers there! And it doesn't seem to suffer from the lack. Uh, aliasing all over the place! Actually, I think that logic programming, usually based on deep unification, brings by far the worst incarnation of aliasing issues to the table. - Andreas -- Andreas Rossberg, [EMAIL PROTECTED] -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Marshall schrieb: void foo() { int i = 0; int j = 0; j = 1; i = 2; // check value of j here. It is still 1, no matter what you filled // in above. // The assignment to i cannot be made to affect the value of j. } Those two local primitive variables cannot be made to have the same identity. Sure. To state it more clearly, they cannot be aliases. But you can update them, so this is an example of mutability without the possibility of identity. You're being a bit sloppy with terminology here. Identity in the phrase above refers to two entities, in the sense of i and j cannot be identical. Identity is actually a property of a single entity, namely that what remains constant regardless of what you do with the entity. Regards, Jo -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Marshall schrieb: By your definition, pointer and variable are synonyms. That doesn't seem like a good idea to me. (What if i and j end up in registers? I have not heard it said that registers have addresses.) There are no registers in the JVM ;-P More specifically, where Joe said pointer he really meant reference. i refers to a variable, and you really want to make sure that the first i refers to the same variable as the second i, so whatever is referred to by i is really an identity. Those two local primitive variables cannot be made to have the same identity. But you can update them, so this is an example of mutability without the possibility of identity. The identity is temporal: You use the same variable name at two different times. Do you intend for the second `i' to mean the same variable as the first `i'? Okay, so i and i have the same identity. Vacuous but true. I suppose you could argue that the language's namespace is an address-space, and variable names are addresses. Correct. At this point, though, you've made the concept of identity so broad that it is now necessarily a property of all languages that use named values, whether updatable or not. I think you would also have to call 3 a void pointer by this scheme. It's not a void pointer, it's a pointer to an integer, but you're correct: 3, when written in a program text, is a reference to the successor of the successor of the successor of zero. If you find that silly and would like to stick with equality, then you're entirely correct. Natural numbers are entirely immutable by definition, and I'm definitely not the first one to observe that identity and equality are indistinguishable for immutable objects. Clearly there is *some* difference between a language which allows explicit pointers and thus aliasing and one that doesn't. You can have aliasing without pointers; e.g. arrays are fully sufficient. If i = j, then a [i] and a [j] are aliases of the same object. (After I observed that, I found it no longer a surprise that array optimizations are what Fortran compiler teams sink most time into. The aliasing problems are *exactly* the same as those with pointers in C - though in practice, the Fortranistas have the advantage that the compiler will usually know that a [i] and b [j] cannot be aliases of each other, so while they have the same problems, the solutions give the Fortranistas more leverage.) What term would you use? First-class variables? I think it's more a quasi-continuous spectrum. On one side, we have alias-free toy languages (no arrays, no pointers, no call-by-reference, just to name the most common sources of aliasing). SQL is slightly more dangerous: it does have aliases, but they are rarely a problem (mostly because views aren't a very common thing, and even if they are used, they aren't usually so abstract that they really hide the underlying dependencies). Next are languages with arrays and call-by-reference. Pascal without pointers would be a candidate. Here, aliasing occurs, and it can and does hide behind abstraction barriers, but it's not a serious problem unless the software becomes *really* large. The end of the line are languages that use references to mutable data everywhere. All OO languages are a typical example of that. Regards, Jo -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Marshall schrieb: Joachim Durchholz wrote: It's just that I know that it's viable to give up destructive updates. Giving up pointers is a far more massive restriction. Oddly, I feel the opposite. While it's true there are many domains for which purely functional programming is a fine choice, there are some domains for which it is insufficient. Any kind of data managament, for example, requires that you be able to update the information. Sure, the data itself cannot be managed non-destructively. However, the programs that operate on the data need not do destructive updates internally. That way, I can have pointers inside the programs without giving up aliasing. (Aliasing is really, really, really important. It's the reason why mathematicians can handle infinite objects - they copy just the alias, i.e. the name or description, instead of the object itself. It's one of the things that make abstraction useful.) Right. To me the response to this clear: give up pointers. Imperative operations are too useful to give up; indeed they are a requirement for certain problems. I don't know any. In some cases, you need an additional level of conceptual indirection - instead of *doing* the updates, you write a function that *describes* them. But then what do you do with that function? I pass it to an engine that's imperative. However, that engine doesn't need to do aliasing anymore. In other words, I'm separating mutability and aliasing, so that they can't combine and explode. Let's say I have an employee database. John Smith just got hired on 1/1/2006 with a salary of $10,000. I need to record this fact somewhere. How do I do that without variables? Current-employees is a variable. Even if I have the space to keep all historical data, so I'm not deleting anything, I still have to have a variable for the latest version of the accumulated data. I can solve this without pointers, but I can't solve it without variables. As I said elsewhere, the record has an identity even though it isn't explicit in SQL. (SQL's data manipulation language is generally not considered half as clean as the query language; I think the core of the problemsis that DML combines mutability and identity.) I should like to learn more about these. I have some vague perception of the existence of linear logic, but not much else. However, I also already have an excellent solution to the pointer aliasing problem, so I'm less motivated. Pointers are just the most obvious form of aliasing. As I said elsewhere, there are others: array indexing, by-reference parameters, or even WHERE clauses in SQL statements. In fact, anything that selects an updatable piece of data from a collection is an alias, and incurs the same problems as pointers, though they may come in different disguises. Actually SQL has references - they are called primary keys, but they are references nevertheless. I strongly object; this is quite incorrect. I grant you that from the 50,000 foot level they appear identical, but they are not. To qualify as a reference, there need to be reference and dereference operations on the reference datatype; there is no such operation is SQL. Would you say the relational algebra has references? Yes. Or, consider the classic prolog ancestor query. Let's say we're setting up as follows father(bob, joe). father(joe, john). Is joe a reference, here? After all, when we do the ancestor query for john, we'll see his father is joe and then use that to find joe's father. Keys in SQL are isomorphic to joe in the above prolog. Agreed. They are both references ;-P (Some SQL dialects also offer synthetic ID fields that are guaranteed to remain stable over the lifetime of a record. Primary keys are updatable; there is nothing special about them. Right - I was wrong with identifying keys and identities. In fact the identity of an SQL record cannot be directly obtained (unless via those ID fields). The records still have identities. It's possible to have two WHERE clauses that refer to the same record, and if you update the record using one WHERE clause, the record returned by the other WHERE clause will have changed, too. With a repeatable read isolation level, you actually return to a declarative view of the database: whatever you do with it, you won't see it until you commit the transaction. (As soon as you commit, the declarative peace is over and you better watch out that your data doesn't become inconsistent due to aliasing.) Alas, transaction isolation levels are a performance hack. I cannot defend them on any logical basis. (Aside: did you mean serializable, rather than repeatable read?) Possibly. There are so many isolation levels that I have to look them up whenever I want to get the terminology 100% correct. The only thing that can really be done about it is not adding it artificially into a program. In those cases where aliasing is part of the
Re: What is a type error?
Joachim Durchholz wrote: Marshall schrieb: What about my example of SQL? Mutation, no pointers, no aliasing. Yet: useful. Sorry, but SQL does have aliasing. Well. I suppose we do not have an agreed upon definition of aliasing, so it is hard to evaluate either way. I would propose using the same one you used for identity: if there are two variables and modifying one also modifies the other, then there is aliasing between them. I avoided mentioning equality to include, for example, having an array i that is an alias to a subset of array j. Please feel free to critque this definition, and/or supply an alternative. E.g. if you have records that have name=John, surname=Doe, the statements SELECT * FROM persons WHERE name = John and SELECT * FROM persons WHERE name = Doe are aliases of each other. The alias is actually in the WHERE clause. Not by my definition, because there is only one variable here. And this *can* get you into trouble if you have something that does UPDATE ... WHERE name = John and UPDATE ... WHERE surname = Doe e.g. doing something with the Johns, then updating the names of all Does, and finally restoring the Johns (but not recognizing that changing the names of all Does might have changed your set of Johns). The fact that some person might get confused about the semantics of what they are doing does not indicate aliasing. It is easy enough to do an analysis of your updates and understand what will happen; this same analysis is impossible with two arbitrary pointers, unless one has a whole-program trace. That strikes me as a significant difference. Conceptually, this is just the same as having two different access path to the same memory cell. Or accessing the same global variable through a call-by-reference parameter and via its global name. There are similarities, but they are not the same. BTW with views, you get not just aliasing but what makes aliasing really dangerous. Without views, you can simply survey all the queries that you are working with and lexically compare table and field names to see whether there's aliasing. With views, the names that you see in a lexical scope are not enough to determine aliasing. E.g. if you use a view that builds upon the set of Johns but aren't aware of that (possibly due to abstraction barriers), and you change the name field of all Does, then you're altering the view without a chance to locally catch the bug. That's just as bad as with any pointer aliasing problem. It is certainly aliasing, but I would not call it just as bad. There are elaborate declarative constraint mechanisms in place, for example. And the definition of the view itsef is a declarative, semantic entity. Whereas a pointer is an opaque, semantics-free address. Marshall -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Andreas Rossberg wrote: Marshall wrote: After all, what are the alternatives? Purely-functional languages remove themselves from a large class of problems that I consider important: data management. Maybe, but I have yet to see how second-class variables are really more adequate in dealing with it. And note that even with second-class state you can still have aliasing issues - you just need mutable arrays and pass around indices. Keys in databases are a more general form of the same problem. So for array a, you would claim that a[5] is an alias for (a part of) a? That seems to stretch the idea of aliasing to me. With these two expressions, it is obvious enough what is going on; with two arbitrary pointers, it is not. It seems to me the source of the problem is the opacity of pointers, but perhaps I am missing something. I have explored the OO path to its bitter end and am convinced it is not the way. So what is left? Uniqueness types and logic programming, I suppose. I enjoy logic programming but it doesn't seem quite right. But notice: no pointers there! And it doesn't seem to suffer from the lack. Uh, aliasing all over the place! Actually, I think that logic programming, usually based on deep unification, brings by far the worst incarnation of aliasing issues to the table. Hmmm. Can you elaborate just a bit? Marshall -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Joachim Durchholz wrote: Marshall schrieb: void foo() { int i = 0; int j = 0; j = 1; i = 2; // check value of j here. It is still 1, no matter what you filled // in above. // The assignment to i cannot be made to affect the value of j. } Those two local primitive variables cannot be made to have the same identity. Sure. To state it more clearly, they cannot be aliases. Yes. But you can update them, so this is an example of mutability without the possibility of identity. You're being a bit sloppy with terminology here. Identity in the phrase above refers to two entities, in the sense of i and j cannot be identical. Identity is actually a property of a single entity, namely that what remains constant regardless of what you do with the entity. It is a fair critque. I'll try to be more precise. Marshall -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Joachim Durchholz wrote: You can have aliasing without pointers; e.g. arrays are fully sufficient. If i = j, then a [i] and a [j] are aliases of the same object. I am having a hard time with this very broad definition of aliasing. Would we also say that a[1+1] and a[2] are aliases? It seems to me, above, that we have only a, and with only one variable there can be no aliasing. A further question: given a 32 bit integer variable x, and offsets i and j (equal as in the above example) would you say that x = (1 i) and x = (1 j) are aliased expressions for setting a particular bit in x? I am not being facetious; I am trying to understand the limits of your definition for aliasing. (After I observed that, I found it no longer a surprise that array optimizations are what Fortran compiler teams sink most time into. The aliasing problems are *exactly* the same as those with pointers in C - though in practice, the Fortranistas have the advantage that the compiler will usually know that a [i] and b [j] cannot be aliases of each other, so while they have the same problems, the solutions give the Fortranistas more leverage.) I don't understand this paragraph. On the one hand, it seems you are saying that C and Fortran are identically burdened with the troubles caused by aliasing, and then a moment later it seems you are saying the situation is distinctly better with Fortran. What term would you use? First-class variables? I think it's more a quasi-continuous spectrum. On one side, we have alias-free toy languages (no arrays, no pointers, no call-by-reference, just to name the most common sources of aliasing). SQL is slightly more dangerous: it does have aliases, but they are rarely a problem (mostly because views aren't a very common thing, and even if they are used, they aren't usually so abstract that they really hide the underlying dependencies). Next are languages with arrays and call-by-reference. Pascal without pointers would be a candidate. Here, aliasing occurs, and it can and does hide behind abstraction barriers, but it's not a serious problem unless the software becomes *really* large. The end of the line are languages that use references to mutable data everywhere. All OO languages are a typical example of that. Now with this, it appears you are agreeing that SQL has an advantage vis-a-vis aliasing compared to OO languages. Yes? If so, we are agreeing on the part I care about, and the specifics of just what we call aliasing are not so important to me. Marshall -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Marshall wrote: Andreas Rossberg wrote: And note that even with second-class state you can still have aliasing issues - you just need mutable arrays and pass around indices. Keys in databases are a more general form of the same problem. So for array a, you would claim that a[5] is an alias for (a part of) a? That seems to stretch the idea of aliasing to me. Not at all, I'd say. In my book, aliasing occurs whenever you have two different lvalues that denote the same mutable cell. I think it would be rather artificial to restrict the definition to lvalues that happen to be variables or dereferenced pointers. So of course, a[i] aliases a[j] when i=j, which in fact is a well-known problem in some array or matrix routines (e.g. copying array slices must always consider overlapping slices as special cases). Uh, aliasing all over the place! Actually, I think that logic programming, usually based on deep unification, brings by far the worst incarnation of aliasing issues to the table. Hmmm. Can you elaborate just a bit? Logic variables immediately become aliases when you unify them. Afterwards, after you bind one, you also bind the other - or fail, because both got already bound the other way round. Unfortunately, unification also is a deep operation, that is, aliasing can be induced into variables deeply nested into some terms that happen to get unified, possibly without you being aware of any of this (the unification, nor the variable containment). To avoid accidental aliasing you essentially must keep track of all potential unifications performed by any given predicate (including transitively, by its subgoals), which totally runs square to all concerns of modularity and abstraction. I've never been much of a logic programmer myself, but our language group has a dedicated LP and CP background, and people here have developed very strong opinions about the adequateness of unrestricted logic variables and particularly unification in practice. I remember somebody calling Prolog the worst imperative language ever invented. - Andreas -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Andreas Rossberg wrote: OK, this is interesting. I don't know Hermes, is this sort of like a dynamically checked equivalent of linear or uniqueness typing? I'm not sure what linear or uniqueness typing is. It's typestate, and if I remember correctly the papers I read 10 years ago, the folks at TJWatson that invented Hermes also invented the concept of typestate. They at least claim to have coined the term. It's essentially a dataflow analysis that allows you to do the same sorts of things that don't read variables that may not yet have been assigned to, except that you could annotate that variables change to the state of uninitialized after they've already been initialized. Mh, but if I understand correctly, this seems to require performing a deep copy - which is well-known to be problematic, and particularly breaks all kinds of abstractions. Um, no, because there are no aliases. There's only one name for any given value, so there's no deep copy problems. A deep copy and a shallow copy are the same thing. And there are types of values you can assign but not copy, such as callmessages (which are problematic to copy for the same reason a stack frame would be problematic to copy). I believe, internally, that there were cases where the copy was deep and cases where it was shallow, depending on the surrounding code. Making a copy of a table and passing it to another process had to be a deep copy (given that a column could contain another table, for example). Making a copy of a table and using it for read-only purposes in the same process would likely make a shallow copy of the table. Iterating over a table and making changes during the iteration made a copy-on-write subtable, then merged it back into the original table when it was done the loop, since the high-level semantic definition of looping over a table is that you iterate over a copy of the table. The only thing close to aliases are references to some other process's input ports (i.e., multiple client-side sockets connected to a server-side socket). If you want to share data (such as a file system or program library), you put the data in a table in a process, and you hand out client-side connections to the process. Mostly, you'd define such connections as accepting a data value (for the file contents) with the parameter being undefined upon return from the call, and the file name as being read-only, for example. If you wanted to store the file, you could just pass a pointer to its data (in the implementation). If you wanted a copy of it, you would either copy it and pass the pointer, or you'd pass the pointer with a flag indicating it's copy-on-write, or you could pass the pointer and have the caller copy it at some point before returning, depending on what the caller did with it. The semantics were high-level with the intent to allow the compiler lots of leeway in implementation, not unlike SQL. Not to mention the issue with uninitialized variables that I would expect occuring all over the place. The typestate tracks this, and prevents you from using uninitialized variables. If you do a read (say, from a socket) and it throws an end of data exception, the compiler prevents you from using the buffer you just tried but failed to read. Indeed, that's the very point of it all. By doing this, you can run untrusted code in the same address space as trusted code, and be assured that the compiler will prevent the untrusted code from messing up the trusted code. The predecessor of Hermes (NIL) was designed to let IBM's customers write efficient networking code and emulations and such that ran in IBM's routers, without the need for expensive (in performance or money) hardware yet with the safety that they couldn't screw up IBM's code and hence cause customer service problems. So unless I'm misunderstanding something, this feels like trading one evil for an even greater one. In truth, it was pretty annoying. But more because you wound up having to write extensive declarations and compile the declarations before compiling the code that implements them and such. That you didn't get to use uninitialized variables was a relatively minor thing, especially given that many languages nowadays complain about uninitialized variables, dead code, etc. But for lots of types of programs, it let you do all kinds of things with a good assurance that they'd work safely and efficiently. It was really a language for writing operating systems in, when you get right down to it. -- Darren New / San Diego, CA, USA (PST) This octopus isn't tasty. Too many tentacles, not enough chops. -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Marshall wrote: Joe Marshall wrote: Marshall wrote: Consider the following Java fragment: void foo() { int i = 0; int j = 0; // put any code here you want j = 1; i = 2; // check value of j here. It is still 1, no matter what you filled in above. // The assignment to i cannot be made to affect the value of j. } True, but you have hidden the pointers. Semantically, the identifiers i and j refer not to integers but to locations that hold integers. The assignment modifies the location. What the implementation looks like shouldn't affect how we speak of the logical model. In the above code, there are no pointers. By your definition, pointer and variable are synonyms. That doesn't seem like a good idea to me. (What if i and j end up in registers? I have not heard it said that registers have addresses.) You probably won't like this, but The example you give above is a bit ambiguous as to whether or not it shows `true' mutation. So what do I mean by `true' mutation? The code above could be transformed by alpha-renaming into an equivalent version that does no mutation: void foo() { int i = 0; int j = 0; // put any code here you want int jj = 1; int ii = 2; // rename all uses of i to ii and j to jj after this point. } Once I've done the renaming, we're left with a pure function. I claim that this sort of `mutation' is uninteresting because it can be trivially eliminated through renaming. Since we can eliminate it through renaming, we can use a simplified semantics that does not involve a `store', and use a substitution model of evaluation. The more interesting kind of mutation cannot be eliminated through renaming. The semantic model requires the notion of a stateful `store', and the model is not referentially transparent: substitution does not work. These qualities are what make mutation problematic. My claim is that this kind of mutation cannot be had without some model of pointers and references. There is a further distinguishing characteristic: Can I write a program that detects the mutation (and acts differently depending on whether there is mutation or not)? It is unclear from your example above whether you intended this to be possible, but certainly there is no way for a caller of foo to tell that foo reassigns an internal variable. If there a procedure is extrinsically a pure function, then there is no real meaning to any `mutation' that goes on inside; there is always a way to turn the internal mutation into a pure function through alpha-renaming. Clearly there is *some* difference between a language which allows explicit pointers and thus aliasing and one that doesn't. What term would you use? First-class variables? My argument to you in re this statement: Again, I disagree: it is posible to have mutability without pointers/identity/objects. is that mutability without a notion of pointers, identity, or objects is trivially eliminated through alpha renaming and thus isn't the `mutability' of interest. -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Marshall schrieb: Joachim Durchholz wrote: Marshall schrieb: What about my example of SQL? Mutation, no pointers, no aliasing. Yet: useful. Sorry, but SQL does have aliasing. Well. I suppose we do not have an agreed upon definition of aliasing, so it is hard to evaluate either way. I would propose using the same one you used for identity: if there are two variables and modifying one also modifies the other, then there is aliasing between them. I think that's an adequate example. For a definition, I'd say it's a bit too narrow unless we use a fairly broad definition for variable. I.e. in a C context, I'd say that a-b is a variable, too, as would be foo(blah)-x. I avoided mentioning equality to include, for example, having an array i that is an alias to a subset of array j. This would mean that there's aliasing between, say, a list of transaction records and the balance of the account (since modifying the list of transactions will change the balance, unless the object isn't properly encapsulated). For purposes of this discussion, it's probably fair to say that's a form of aliasing, too, even though it's quite indirect. E.g. if you have records that have name=John, surname=Doe, the statements SELECT * FROM persons WHERE name = John and SELECT * FROM persons WHERE name = Doe are aliases of each other. Argh... I made a most braindamaged, stupid mistake here. The second SELECT should have used the *surname* field, so it would be SELECT * FROM persons WHERE surname = Doe Then, if there's a record that has name = John, surname = Doe, the two WHERE clauses have aliasing in the form of overlapping result sets. The alias is actually in the WHERE clause. Not by my definition, because there is only one variable here. Sorry, my wording was sloppy. I meant to say that 'in SQL, you identify records via clauses like WHERE name = John, i.e. WHERE clauses are a kind of identity'. This is still not precise - the identity of an SQL record isn't explicitly accessible (except the nonstandard OID facility that some SQL engines offer). Nevertheless, they do have an identity, and there's a possibility of aliasing - if you change all Johns, you may also change a Doe. And this *can* get you into trouble if you have something that does UPDATE ... WHERE name = John and UPDATE ... WHERE surname = Doe e.g. doing something with the Johns, then updating the names of all Does, and finally restoring the Johns (but not recognizing that changing the names of all Does might have changed your set of Johns). The fact that some person might get confused about the semantics of what they are doing does not indicate aliasing. It is easy enough to do an analysis of your updates and understand what will happen; this same analysis is impossible with two arbitrary pointers, unless one has a whole-program trace. That strikes me as a significant difference. Sure. I said that aliases in SQL aren't as bad as in other programs. Once you get abstraction mixed in, the analysis becomes less straightforward. In the case of SQL, views are such an abstraction facility, and they indeed can obscure what you're doing and make the analysis more difficult. If it's just SQL we're talking about, you indeed have to look at the whole SQL to check whether there's a view that may be involved in the queries you're analysing, so the situation isn't *that* different from pointers - it's just not a problem because the amount of code is so tiny! Conceptually, this is just the same as having two different access path to the same memory cell. Or accessing the same global variable through a call-by-reference parameter and via its global name. There are similarities, but they are not the same. What are the relevant differences? How does the semantics of a WHERE clause differ from that of a pointer, in terms of potential aliasing? My line of argument would be this: Pointers can be simulated using arrays, so it's no surprise that arrays can emulate all the aliasing of pointers. Arrays can be simulated using WHERE (with SELECT and UPDATE), so I'd say that SQL can emulate all the aliasing of arrays, and hence that of pointers. BTW with views, you get not just aliasing but what makes aliasing really dangerous. Without views, you can simply survey all the queries that you are working with and lexically compare table and field names to see whether there's aliasing. With views, the names that you see in a lexical scope are not enough to determine aliasing. E.g. if you use a view that builds upon the set of Johns but aren't aware of that (possibly due to abstraction barriers), and you change the name field of all Does, then you're altering the view without a chance to locally catch the bug. That's just as bad as with any pointer aliasing problem. It is certainly aliasing, but I would not call it just as bad. There are elaborate declarative constraint mechanisms in
Re: What is a type error?
Marshall schrieb: Joachim Durchholz wrote: You can have aliasing without pointers; e.g. arrays are fully sufficient. If i = j, then a [i] and a [j] are aliases of the same object. I am having a hard time with this very broad definition of aliasing. Would we also say that a[1+1] and a[2] are aliases? It seems to me, above, that we have only a, and with only one variable there can be no aliasing. a[1] is a variable, too. You can assign values to it. Inside function foo when called as foo(a[1]), it may even be referenced with yet another name and updated in isolation; foo may not even know it's an array element. (This is sort-of true in C, and definitely true in languages that don't have pointer arithmetic. If you pass a[1] to a call-by-reference parameter in Pascal, the callee has no way to access the parameter as if it were an array element.) A further question: given a 32 bit integer variable x, and offsets i and j (equal as in the above example) would you say that x = (1 i) and x = (1 j) are aliased expressions for setting a particular bit in x? Yes. I am not being facetious; I am trying to understand the limits of your definition for aliasing. Understood and appreciated. (After I observed that, I found it no longer a surprise that array optimizations are what Fortran compiler teams sink most time into. The aliasing problems are *exactly* the same as those with pointers in C - though in practice, the Fortranistas have the advantage that the compiler will usually know that a [i] and b [j] cannot be aliases of each other, so while they have the same problems, the solutions give the Fortranistas more leverage.) I don't understand this paragraph. On the one hand, it seems you are saying that C and Fortran are identically burdened with the troubles caused by aliasing, and then a moment later it seems you are saying the situation is distinctly better with Fortran. That's exactly the situation. Aliasing is present in both language, but in Fortran, the guarantee that two names aren't aliased are easier to come by. Even more non-aliasing guarantees exist if the language has a more expressive type system. In most cases, if you know that two expressions have different types, you can infer that they cannot be aliases. Of course, the life of compiler writers if of less importance to us; I added them only because the reports of Fortran compilers having aliasing problems due to array indexes made me aware that pointers aren't the only source of aliasing. That was quite a surprise to me then. Now with this, it appears you are agreeing that SQL has an advantage vis-a-vis aliasing compared to OO languages. Yes? Sure. I think that making an aliasing bug in SQL is just as easy as in any pointer-ridden language, but transactions and constraints (plus the considerably smaller typical size of SQL code) make it far easier to diagnose any such bugs. If so, we are agreeing on the part I care about, and the specifics of just what we call aliasing are not so important to me. I'm not sure whether I'm getting the same mileage out of this, but the relevance of a problem is obviously a function on what one is doing on a day-to-day basis, so I can readily agree with that :-) Regards, Jo -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Marshall wrote... I am having a hard time with this very broad definition of aliasing. Would we also say that a[1+1] and a[2] are aliases? It seems to me, above, that we have only a, and with only one variable there can be no aliasing. The problem with this (and with the relational one as well) is that the practical problems don't go away by redefining variable to mean larger collections of values. given a 32 bit integer variable x, and offsets i and j (equal as in the above example) would you say that x = (1 i) and x = (1 j) are aliased expressions for setting a particular bit in x? I don't know about Joachim, but yes, I would. -- Chris Smith - Lead Software Developer / Technical Trainer MindIQ Corporation -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Marshall wrote: Okay, sure. But for the problem you describe, both imperativeness and the presence of pointers is each necessary but not sufficient; it is the two together that causes the problem. So it strikes me (again, a very minor point) as inaccurate to describe this as a problem with imperative languages per se. [...] Right. To me the response to this clear: give up pointers. Imperative operations are too useful to give up; indeed they are a requirement for certain problems. Pointers on the other hand add nothing except efficiency and a lot of confusion. They should be considered an implementation technique only, hidden behind some pointerless computational model. Don't get yourself distracted by the low-level notion of pointer. The problem *really* is mutability and the associated notion of identity, which explicit pointers just exhibit on a very low level. When you have a language with mutable types (e.g. mutable arrays) then objects of these types have identity, which is observable through assignment. This is regardless of whether identity is an explicit concept (like it becomes with pointers and comparison of pointer values, i.e. addresses). Consequently, you cannot possibly get rid of aliasing issues without getting rid of (unrestricted) mutability. Mutability implies object identity implies aliasing problems. On the other hand, pointers are totally a futile concept without mutability: if everything is immutable, it is useless to distinguish between an object and a pointer to it. In other words, pointers are essentially just an *aspect* of mutability in lower-level languages. On a sufficiently high level of abstraction, it does not make much sense to differentiate between both concepts - pointers are just mutable objects holding other mutable objects (immutable pointer types exist, but are only interesting if you also have pointer arithmetics - which, however, is largely equivalent to arrays, i.e. not particularly relevant either). - Andreas -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Marshall schrieb: Joachim Durchholz wrote: Marshall schrieb: Joachim Durchholz wrote: Marshall schrieb: I can see the lack of a formal model being an issue, but is the imperative bit really all that much of an obstacle? How hard is it really to deal with assignment? Or does the issue have more to do with pointers, aliasing, etc.? Actually aliasing is *the* hard issue. Okay, sure. Nice explanation. But one minor point: you describe this as an issue with imperative languages. But aliasing is a problem associated with pointers, not with assignment. Aliasing is not a problem if the aliased data is immutable. Okay, sure. But for the problem you describe, both imperativeness and the presence of pointers is each necessary but not sufficient; it is the two together that causes the problem. So it strikes me (again, a very minor point) as inaccurate to describe this as a problem with imperative languages per se. Sure. It's just that I know that it's viable to give up destructive updates. Giving up pointers is a far more massive restriction. Right. To me the response to this clear: give up pointers. Imperative operations are too useful to give up; indeed they are a requirement for certain problems. I don't know any. In some cases, you need an additional level of conceptual indirection - instead of *doing* the updates, you write a function that *describes* them. Pointers on the other hand add nothing except efficiency and a lot of confusion. They should be considered an implementation technique only, hidden behind some pointerless computational model. I recognize that this is likely to be a controversial opinion. Indeed. In fact modes are a way to restrict pointer aliasing. I heartily support immutability as the default, for these and other reasons. OK, then we're in agreement here. Some functional languages restrict assignment so that there can exist at most a single reference to any mutable data structure. That way, there's still no aliasing problems, but you can still update in place where it's really, really necessary. Are we speaking of uniqueness types now? I haven't read much about them, but it certainly seems like an intriguing concept. Yup. It's called modes in some other languages (Mercury or Clean IIRC). I know of no professional language that doesn't have references of some kind. Ah, well. I suppose I could mention prolog or mercury, but then you used that troublesome word professional. So I will instead mention a language which, if one goes by number of search results on hotjobs.com for xxx progammer for different value of xxx, is more popular than Java and twice as popular as C++. It lacks pointers (although I understand they are beginning to creep in in the latest version of the standard.) It also posesses a quite sophisticated set of facilities for declarative integrity constraints. Yet for some reason it is typically ignored by language designers. http://hotjobs.yahoo.com/jobseeker/jobsearch/search_results.html?keywords_all=sql+programmer Oh, right. SQL is an interesting case of getting all the problems of pointers without having them ;-) Actually SQL has references - they are called primary keys, but they are references nevertheless. (Some SQL dialects also offer synthetic ID fields that are guaranteed to remain stable over the lifetime of a record. Seems like SQL is imperative enough that programmers want this, else the SQL vendors wouldn't have added the feature...) SQL also has updates. The result: updates with undefined semantics. E.g. if you have a numeric key field, UPDATE commands that increment the key by 1 will fail or work depending on the (unspecified) order in which UPDATE touches the records. You can have even more fun with updatable views. With a repeatable read isolation level, you actually return to a declarative view of the database: whatever you do with it, you won't see it until you commit the transaction. (As soon as you commit, the declarative peace is over and you better watch out that your data doesn't become inconsistent due to aliasing.) Aliasing isn't really related to specific programming practices. If two accountants chat, and one talks about the hot blonde secretaire and the other about his adorable wife, you can imagine the awkwardness that ensues as soon as they find out they're talking about the same person! The only thing that can really be done about it is not adding it artificially into a program. In those cases where aliasing is part of the modelled domain, you really have to carefully inspect all interactions and never, never, never dream about abstracting it away. Regards, Jo -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Darren New schrieb: Joachim Durchholz wrote: Actually, in a functional programming language (FPL), you write just the postconditions and let the compiler generate the code for you. Certainly. And my point is that the postcondition describing all valid chess boards reachable from this one is pretty much going to be as big as an implementation for generating it, yes? Yes. It's a classical case where the postcondition and the code that fulfils it are essentially the same. The postcondition will still have to contain all the rules of chess in it, for example. At best you've replaced loops with some sort of universal quanitifier with a such that phrase. Correct. OTOH, isn't that the grail that many people have been searching for: programming by simply declaring the results that they want to see? Anyway, I expect you could prove you can't do this in the general case. Otherwise, you could just write a postcondition that asserts the output of your function is machine code that when run generates the same outputs as the input string would. I.e., you'd have a compiler that can write other compilers, generated automatically from a description of the semantics of the input stream and the semantics of the machine the code is to run on. I'm pretty sure we're not there yet, and I'm pretty sure you start running into the limits of computability if you do that. No, FPLs are actually just that: compilable postconditions. Computability issues aren't more or less a factor than with other kinds of compilers: they do limit what you can do, but these limits are loose enough that you can do really useful stuff within them (in particular, express all algorithms). Regards, Jo -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Marshall schrieb: David Hopwood wrote: This property is, after all, not something that the program should depend on. It is determined by how good the static checker currently is, and we want to be able to improve checkers (and perhaps even allow them to regress slightly in order to simplify them) without changing programs. Hmmm. I have heard that argument before and I'm conflicted. I'd want several things. A way for me to indicate what assertions must be proven statically. Highlighting (be it compiler messages or flashing colors in an IDE) that marks assertions that *will* break. And highlighting for assertions that *may* break. In the language, a (possibly) simplicistic inference engine definition that gives me minimum guarantees about the things that it will be able to prove; if something is out of the reach of the engine, a straightforward way to add intermediate assertions until the inference succeeds. (Plus diagnostics that tell me where the actual error may be, whether it's a bug in the code or an omission in the assertions. That's probably the hardest part of it all.) Regards, Jo -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Andreas Rossberg wrote: Marshall wrote: Okay, sure. But for the problem you describe, both imperativeness and the presence of pointers is each necessary but not sufficient; it is the two together that causes the problem. So it strikes me (again, a very minor point) as inaccurate to describe this as a problem with imperative languages per se. [...] Right. To me the response to this clear: give up pointers. Imperative operations are too useful to give up; indeed they are a requirement for certain problems. Pointers on the other hand add nothing except efficiency and a lot of confusion. They should be considered an implementation technique only, hidden behind some pointerless computational model. Don't get yourself distracted by the low-level notion of pointer. The problem *really* is mutability and the associated notion of identity, which explicit pointers just exhibit on a very low level. When you have a language with mutable types (e.g. mutable arrays) then objects of these types have identity, which is observable through assignment. This is regardless of whether identity is an explicit concept (like it becomes with pointers and comparison of pointer values, i.e. addresses). Hmmm, well, I cannot agree. You've defined away the pointers but then slipped them back in again by assumption (objects of these types have identity.) First let me say that the terminology is somewhat problematic. For the specific issue being discussed here, pointers, identity, and objects are all the same concept. (I agree that pointer connotes a low-level construct, however.) Sometimes I think of this issue as being one with first class variables. An object with mutable fields is a variable, and if we have pointers or references or any way to have two different pathways to that object/those variables, then we run in to the aliasing problem. However if the mutable types are not first class, then there is no way to have the aliasing. Thus, if we do not have pointers or objects or identity but retain mutability, there is no aliasing problem. Consequently, you cannot possibly get rid of aliasing issues without getting rid of (unrestricted) mutability. Mutability implies object identity implies aliasing problems. Mutability by itself does not imply identity. I agree that mutability plus identity implies aliasing problems, however. On the other hand, pointers are totally a futile concept without mutability: if everything is immutable, it is useless to distinguish between an object and a pointer to it. Agreed. In other words, pointers are essentially just an *aspect* of mutability in lower-level languages. Again, I disagree: it is posible to have mutability without pointers/identity/objects. Marshall -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Joachim Durchholz wrote: Marshall schrieb: Joachim Durchholz wrote: Marshall schrieb: Joachim Durchholz wrote: Marshall schrieb: I can see the lack of a formal model being an issue, but is the imperative bit really all that much of an obstacle? How hard is it really to deal with assignment? Or does the issue have more to do with pointers, aliasing, etc.? Actually aliasing is *the* hard issue. Okay, sure. Nice explanation. But one minor point: you describe this as an issue with imperative languages. But aliasing is a problem associated with pointers, not with assignment. Aliasing is not a problem if the aliased data is immutable. Okay, sure. But for the problem you describe, both imperativeness and the presence of pointers is each necessary but not sufficient; it is the two together that causes the problem. So it strikes me (again, a very minor point) as inaccurate to describe this as a problem with imperative languages per se. Sure. It's just that I know that it's viable to give up destructive updates. Giving up pointers is a far more massive restriction. Oddly, I feel the opposite. While it's true there are many domains for which purely functional programming is a fine choice, there are some domains for which it is insufficient. Any kind of data managament, for example, requires that you be able to update the information. On the other hand, there is no problem domain for which pointers are a requirement. I agree they are deucedly convenient, though. Right. To me the response to this clear: give up pointers. Imperative operations are too useful to give up; indeed they are a requirement for certain problems. I don't know any. In some cases, you need an additional level of conceptual indirection - instead of *doing* the updates, you write a function that *describes* them. But then what do you do with that function? Let's say I have an employee database. John Smith just got hired on 1/1/2006 with a salary of $10,000. I need to record this fact somewhere. How do I do that without variables? Current-employees is a variable. Even if I have the space to keep all historical data, so I'm not deleting anything, I still have to have a variable for the latest version of the accumulated data. I can solve this without pointers, but I can't solve it without variables. Pointers on the other hand add nothing except efficiency and a lot of confusion. They should be considered an implementation technique only, hidden behind some pointerless computational model. I recognize that this is likely to be a controversial opinion. Indeed. In fact modes are a way to restrict pointer aliasing. I should like to learn more about these. I have some vague perception of the existence of linear logic, but not much else. However, I also already have an excellent solution to the pointer aliasing problem, so I'm less motivated. I heartily support immutability as the default, for these and other reasons. OK, then we're in agreement here. Some functional languages restrict assignment so that there can exist at most a single reference to any mutable data structure. That way, there's still no aliasing problems, but you can still update in place where it's really, really necessary. Are we speaking of uniqueness types now? I haven't read much about them, but it certainly seems like an intriguing concept. Yup. It's called modes in some other languages (Mercury or Clean IIRC). Cool. I know of no professional language that doesn't have references of some kind. Ah, well. I suppose I could mention prolog or mercury, but then you used that troublesome word professional. So I will instead mention a language which, if one goes by number of search results on hotjobs.com for xxx progammer for different value of xxx, is more popular than Java and twice as popular as C++. It lacks pointers (although I understand they are beginning to creep in in the latest version of the standard.) It also posesses a quite sophisticated set of facilities for declarative integrity constraints. Yet for some reason it is typically ignored by language designers. http://hotjobs.yahoo.com/jobseeker/jobsearch/search_results.html?keywords_all=sql+programmer Oh, right. SQL is an interesting case of getting all the problems of pointers without having them ;-) Oh, pooh. SQL has plenty of problems, sure, but the problems of pointers are not among its faults. Actually SQL has references - they are called primary keys, but they are references nevertheless. I strongly object; this is quite incorrect. I grant you that from the 50,000 foot level they appear identical, but they are not. To qualify as a reference, there need to be reference and dereference operations on the reference datatype; there is no such operation is SQL. Would you say the relational algebra has references? Or, consider the classic prolog ancestor query. Let's say we're
Re: What is a type error?
Marshall wrote: However if the mutable types are not first class, then there is no way to have the aliasing. Thus, if we do not have pointers or objects or identity but retain mutability, there is no aliasing problem. Yes, technically you are right. But this makes a pretty weak notion of mutability. All stateful data structures had to stay within their lexical scope, and could never be passed to a function. For example, this essentially precludes object-oriented programming, because you could not have objects with state (the alternative, second class objects, would be even less objective). Generally, second-classness is an ad-hoc restriction that can work around all kinds of problems, but rarely with satisfactory results. So I would tend to say that this is not an overly interesting point in the design space. But YMMV. In other words, pointers are essentially just an *aspect* of mutability in lower-level languages. Again, I disagree: it is posible to have mutability without pointers/identity/objects. OK, if you prefer: it is an aspect of first-class mutability - which is present in almost all imperative languages I know. :-) - Andreas -- Andreas Rossberg, [EMAIL PROTECTED] -- http://mail.python.org/mailman/listinfo/python-list
[Way off-topic] Re: What is a type error?
Joachim Durchholz wrote: A concrete example: the first thing that Windows does when accepting userland data structures is... to copy them; this were unnecessary if the structures were immutable. It is necessary also because the userland structures are in a different privilege domain, and the memory backing them is potentially mapped by other processes, or accessible by other threads of this process, that can be running in parallel while the current thread is in the kernel. Also, it tends to be easier and more secure to treat userland structures as if they were in a different address space (even if for a particular OS, the kernel would be able to access user mode addresses directly). The kernel cannot trust that pointers in userland structures are valid, for instance. So, in order to avoid a copy when accepting a userland structure: - when servicing a kernel call for a given process, the OS must use kernel-mode page table mappings that are a superset of the user-mode mappings for that process at the point of the call. - the memory backing the structure must be immutable in all processes that could map it, and the kernel must trust the means by which it is made immutable. -- David Hopwood [EMAIL PROTECTED] -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Andreas Rossberg wrote: Yes, technically you are right. But this makes a pretty weak notion of mutability. All stateful data structures had to stay within their lexical scope, and could never be passed to a function. Not really. The way Hermes handles this is with destructive assignment. Each variable holds a value, and you (almost) cannot have multiple variables referring to the same value. If you want to assign Y to X, you use X := Y after which Y is considered to be uninitialized. If you want X and Y to have the same value, you use X := copy of Y after which X and Y have the same value but are otherwise unrelated, and changes to one don't affect the other. (As almost an aside, the non-scalar data structures were very similar to SQL data tables.) If you declare (the equivalent to) a function, you can indicate whether the paramters matching the arguments are passed destructively, or are read-only, or are copy-in-copy-out. So you could declare a function, for example, that you pass a table into, and if it's marked as a read-only parameter, the compiler ensures the callee does not modify the table and the compiler generates code to pass a pointer. One could also mark a variable (for example) as uninitialized on entry, intialized on return, and uninitalized on the throw of an exception, and this could be used (for example) for the read-a-line-from-a-socket routine. The only value that came close to being shared is an outgoing connection to a procedure; the equivalent of the client side of a socket. For these, you could make copies, and each copy would point to the same receiver. The receiving process could change over time, by passing its end of the socket in a message to some other process (live code upgrading, for example). Since everything could be passed as part of a message, including code, procedures, tables, and inports and outports (the ends of sockets), I don't see that it had any problems with first classness. OK, if you prefer: it is an aspect of first-class mutability - which is present in almost all imperative languages I know. :-) I disagree. It's entirely possible to make sophisticated imperitive languages with assignment and without aliasing. -- Darren New / San Diego, CA, USA (PST) This octopus isn't tasty. Too many tentacles, not enough chops. -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Joachim Durchholz [EMAIL PROTECTED] wrote: OTOH, isn't that the grail that many people have been searching for: programming by simply declaring the results that they want to see? Possibly. No, FPLs are actually just that: compilable postconditions. This seems to me a bit misleading. Perhaps someone will explain why I should stop thinking this way; but currently I classify statements like this in the well, sort of slot of my mind. If functional programming were really just compilable postconditions, then functional programmers would be able to skip a good bit of stuff that they really can't. For example, time and space complexity of code is still entirely relevant for functional programming. I can't simply write: (define fib (lambda (x) (if ( x 2) 1 (+ (fib (- x 1)) (fib (- x 2)) and expect the compiler to create an efficient algorithm for me. This is true even though the above is (the LISP transcription of) the most natural way to describe the fibonacci sequence from a mathematical standpoint. It still runs in exponential time, and it still matters that it runs in exponential time; and LISP programmers adapt their so- called declarative code to improve time bounds all the time. This makes it harder for me to call it declarative (or compilable postconditions) and feel entirely honest. (Yes, I realize that the above could be optimized in a language that does normal order evaluation with common subexpression elimination. and become linear-time. However, that's not true of algorithms in general. It is not the case that all that's needed to find an efficient algorithm for something is to plug it into a Haskell compiler and observe what happens. Or, if that is the case, there are a few CS professors I know who would be quite interested in hearing so.) Computability issues aren't more or less a factor than with other kinds of compilers: they do limit what you can do, but these limits are loose enough that you can do really useful stuff within them (in particular, express all algorithms). Is it really consistent to say that postconditions allow you to express algorithms? -- Chris Smith - Lead Software Developer / Technical Trainer MindIQ Corporation -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Marshall schrieb: Mutability by itself does not imply identity. Well, the implication certainly holds from identity to mutability. The only definition of identity that I found to hold up for all kinds of references (pointers, shared-memory identifiers, URLs etc.) is this: Two pieces of data are identical if and only if: a) they are equal b) they stay equal after applying an arbitrary operation to one of them. This means that for immutable objects, there's no observable difference between equality and identity (which I think it just fine). For the implicaton from mutability to identity, I'm not sure whether talking about mutation still makes sense without some kind of identity. For example, you need to establish that the object after the mutation is still the same in some sense, and this the same concept is exactly identity. I agree that mutability plus identity implies aliasing problems, however. Then we're agreeing about the most important point anyway. In other words, pointers are essentially just an *aspect* of mutability in lower-level languages. Again, I disagree: it is posible to have mutability without pointers/identity/objects. I'm sceptical. Any examples? Regards, Jo -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Marshall [EMAIL PROTECTED] wrote: Hmmm, well, I cannot agree. You've defined away the pointers but then slipped them back in again by assumption (objects of these types have identity.) First let me say that the terminology is somewhat problematic. For the specific issue being discussed here, pointers, identity, and objects are all the same concept. (I agree that pointer connotes a low-level construct, however.) Unless I'm missing your point, I disagree with your disagreement. Mutability only makes sense because of object identity (in the generic sense; no OO going on here). Without object identities, mutability is useless. What's the use of changing something if you're not sure you'll ever be able to find it again? You may limit the scope of object identity arbitrarily, even to the point that aliasing is impossible (though with lexical closure, that gets even more limiting than it may first appear)... but you're just trading off power for simplicity, and the really interesting uses of mutations are those that allow access to specific objects from any number different bits of code, on a program-wide or at least module-wide scope. Most mediocre programmers could replace assignment with recursion if that assignment is limited to local variables of a single subroutine. I don't necessarily agree that the result will be a better program despite others' conviction on the matter; however, the difference certainly isn't worth complicating the language with mutation unless you're willing to allow the interesting uses of mutation as well. Mutability by itself does not imply identity. I agree that mutability plus identity implies aliasing problems, however. We might have a terminological issue, then. I'd tend to say that mutability definitely does imply identity, but identity doesn't imply aliasing. Same difference. -- Chris Smith - Lead Software Developer / Technical Trainer MindIQ Corporation -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Chris Smith wrote: Unless I'm missing your point, I disagree with your disagreement. Mutability only makes sense because of object identity (in the generic sense; no OO going on here). Depends what you mean by object. int x = 6; int y = 5; x = y; I'd say x was mutable, with no identity problems involved? Why is it problematic that variables have identity and are mutable? Certainly I can later find whatever value I put into x. -- Darren New / San Diego, CA, USA (PST) This octopus isn't tasty. Too many tentacles, not enough chops. -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Marshall wrote: Again, I disagree: it is posible to have mutability without pointers/identity/objects. I think you are wrong, but before I make a complete ass out of myself, I have to ask what you mean by `mutability'. (And pointers/identity/objects, for that matter.) Alan Bawden discusses the phenomenon of `state' in his Ph.D. dissertation Implementing Distributed Systems Using Linear Naming. MIT AI Lab Technical Report AITR-1627. March 1993 He makes a persuasive argument that `state' is associated with cycles in naming. -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Marshall wrote: David Hopwood wrote: Marshall wrote: Wouldn't it be possible to do them at compile time? (Although this raises decidability issues.) It is certainly possible to prove statically that some assertions cannot fail. The ESC/Java 2 (http://secure.ucd.ie/products/opensource/ESCJava2/docs.html) tool for JML (http://www.cs.iastate.edu/~leavens/JML/) is one system that does this -- although some limitations and usability problems are described in http://secure.ucd.ie/products/opensource/ESCJava2/ESCTools/papers/CASSIS2004.pdf. I look forward to reading this. I read a paper on JML a while ago and found it quite interesting. Mightn't it also be possible to leave it up to the programmer whether a given contract was compile-time or runtime? That would be possible, but IMHO a better option would be for an IDE to give an indication (by highlighting, for example), which contracts are dynamically checked and which are static. This property is, after all, not something that the program should depend on. It is determined by how good the static checker currently is, and we want to be able to improve checkers (and perhaps even allow them to regress slightly in order to simplify them .. or improve their performance .. ) without changing programs. Hmmm. I have heard that argument before and I'm conflicted. I can think of more reasons than just runtime safety for which I'd want proofs. Termination for example, in highly critical code; not something for which a runtime check will suffice. It is true that some properties cannot be verified directly by a runtime check, but that does not mean that runtime checks are not indirectly useful in verifying them. For example, we can check at runtime that a loop variant is strictly decreasing with each iteration. Then, given that each iteration of the loop body terminates, it is guaranteed that the loop terminates, *either* because the runtime check fails, or because the variant goes to zero. In general, we can verify significantly more program properties using a combination of runtime checks and static proof, than we can using static proof alone. That may seem like quite an obvious statement, but the consequence is that any particular property is, in general, not verified purely statically or purely at runtime. I am not opposed to being able to annotate an assertion to say that it should be statically provable and that a runtime check should not be used. However, - such annotations should be very lightweight and visually undistracting, relative to the assertion itself; - a programmer should not interpret such an annotation on a particular assertion to mean that its static validity is not reliant on runtime checks elsewhere; - if the class of assertions that are statically provable changes, then a tool should be provided which can *automatically* add or remove these annotations (with programmer approval when they are removed). I'd like to make a couple more comments about when it is sufficient to detect errors and when it is necessary to prevent them: - If a language supports transactions, then this increases the proportion of cases in which it is sufficient to detect errors in imperative code. When state changes are encapsulated in a transaction, it is much easier to recover if an error is detected, because invariants that were true of objects used by the transaction when it started will be automatically reestablished. (Purely functional code does not need this.) - Almost all safety-critical systems have a recovery or safe shutdown behaviour which should be triggered when an error is detected in the rest of the program. The part of the program that implements this behaviour definitely needs to be statically correct, but it is usually only a small amount of code. Safety-critical systems that must either prevent errors or continue functioning in their presence (aircraft control systems, for example) are in a separate category that demand *much* greater verification effort. Even for these systems, though, it is still useful to detect errors in cases where they cannot be prevented. When multiple independent implementations of a subsystem are used to check each other, this error detection can be used as an input to the decision of which implementation is failing and which should take over. -- David Hopwood [EMAIL PROTECTED] -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Joachim Durchholz wrote: Marshall schrieb: Mutability by itself does not imply identity. Well, the implication certainly holds from identity to mutability. The only definition of identity that I found to hold up for all kinds of references (pointers, shared-memory identifiers, URLs etc.) is this: Two pieces of data are identical if and only if: a) they are equal b) they stay equal after applying an arbitrary operation to one of them. This means that for immutable objects, there's no observable difference between equality and identity (which I think it just fine). Agreed on all counts. For the implicaton from mutability to identity, I'm not sure whether talking about mutation still makes sense without some kind of identity. For example, you need to establish that the object after the mutation is still the same in some sense, and this the same concept is exactly identity. Unless we have some specific mechanism to make two named variables have the same identity, (distinct from having the same value), then there is no aliasing. Pointers or references is one such mechanism; lexical closures over variables is another. (I don't know of any others.) I agree that mutability plus identity implies aliasing problems, however. Then we're agreeing about the most important point anyway. Yes. In other words, pointers are essentially just an *aspect* of mutability in lower-level languages. Again, I disagree: it is posible to have mutability without pointers/identity/objects. I'm sceptical. Any examples? See next post. Marshall -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Chris Smith wrote: Joachim Durchholz [EMAIL PROTECTED] wrote: OTOH, isn't that the grail that many people have been searching for: programming by simply declaring the results that they want to see? Possibly. No, FPLs are actually just that: compilable postconditions. This seems to me a bit misleading. Perhaps someone will explain why I should stop thinking this way; but currently I classify statements like this in the well, sort of slot of my mind. If functional programming were really just compilable postconditions, then functional programmers would be able to skip a good bit of stuff that they really can't. For example, time and space complexity of code is still entirely relevant for functional programming. I can't simply write: (define fib (lambda (x) (if ( x 2) 1 (+ (fib (- x 1)) (fib (- x 2)) and expect the compiler to create an efficient algorithm for me. This is true, but note that postconditions also need to be efficient if we are going to execute them. That is, the difference you've pointed out is not a difference between executable postconditions and functional programs. Both the inefficient functional definition of 'fib' and an efficient one are executable postconditions. In order to prove that the efficient implementation is as correct as the inefficient one, we need to prove that, treated as postconditions, the former implies the latter. (In this case a single deterministic result is required, so the former will be equivalent to the latter.) -- David Hopwood [EMAIL PROTECTED] -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Joe Marshall wrote: Marshall wrote: Again, I disagree: it is posible to have mutability without pointers/identity/objects. I think you are wrong, but before I make a complete ass out of myself, I have to ask what you mean by `mutability'. (And pointers/identity/objects, for that matter.) Responding to requests for examples from Joachim, Joe, and Chris The very simple example is the one Darren New already mentioned. Consider the following Java fragment: void foo() { int i = 0; int j = 0; // put any code here you want j = 1; i = 2; // check value of j here. It is still 1, no matter what you filled in above. // The assignment to i cannot be made to affect the value of j. } Those two local primitive variables cannot be made to have the same identity. But you can update them, so this is an example of mutability without the possibility of identity. Earlier I also mentioned SQL tables as an example, although SQL supports *explicit* aliasing via views. Alan Bawden discusses the phenomenon of `state' in his Ph.D. dissertation Implementing Distributed Systems Using Linear Naming. MIT AI Lab Technical Report AITR-1627. March 1993 He makes a persuasive argument that `state' is associated with cycles in naming. I would like to read that, but my brain generally runs out of gas at about 21 pages, so it's about an order of magnitude bigger than I can currently handle. :-( As to cycles in naming that's certainly an issue. But it it a requirement for state? Back to Java locals, it seems to me they meet the standard definition of state, despite the lack of cycles. As to pointers/references, I earlier mentioned the existence of the reference/dereference operations as being definitional. Note that one can go to some lengths to obscure them, but they're still there. For example, Java has the reference and dereference operators; Java's . operator is actually C's - operator. I am not so bold/foolish as to attempt a defintion of object however. :-) Marshall -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Marshall wrote: Consider the following Java fragment: void foo() { int i = 0; int j = 0; // put any code here you want j = 1; i = 2; // check value of j here. It is still 1, no matter what you filled in above. // The assignment to i cannot be made to affect the value of j. } True, but you have hidden the pointers. Semantically, the identifiers i and j refer not to integers but to locations that hold integers. The assignment modifies the location. Those two local primitive variables cannot be made to have the same identity. But you can update them, so this is an example of mutability without the possibility of identity. The identity is temporal: You use the same variable name at two different times. Do you intend for the second `i' to mean the same variable as the first `i'? -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Darren New [EMAIL PROTECTED] wrote: Chris Smith wrote: Unless I'm missing your point, I disagree with your disagreement. Mutability only makes sense because of object identity (in the generic sense; no OO going on here). Depends what you mean by object. int x = 6; int y = 5; x = y; I'd say x was mutable, with no identity problems involved? The variable x definitely has identity that's independent of its value. Some might call that a problem in and of itself, as it complicates the formal model of the language and makes it difficult to predict what result will be produced by normal order evaluation. On the other hand, this thread seems to be using identity to mean identity with potential for aliasing, in which case it is vacuously true that eliminating identity also prevents the problems that arise from aliasing. It is true, and I agree on this with Marshall, that eliminating the potential for aliasing solves a lot of problems with checking invariants. I also see, though, that the majority (so far, I'd say all) of the potential uses for which it's worth introducing mutation into an otherwise mutation-free language allow the possibility of aliasing, which sorta makes me wonder whether this problem is worth solving. I'd like to see an example of code that would be harder to write without mutation, but which can obey any restriction that's sufficient to prevent aliasing. Why is it problematic that variables have identity and are mutable? Certainly I can later find whatever value I put into x. I simply found the language confusing. I said it would be nonsensical for a language to have mutation without identity. -- Chris Smith - Lead Software Developer / Technical Trainer MindIQ Corporation -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
David Hopwood [EMAIL PROTECTED] wrote: This is true, but note that postconditions also need to be efficient if we are going to execute them. If checked by execution, yes. In which case, I am trying to get my head around how it's any more true to say that functional languages are compilable postconditions than to say the same of imperative languages. In both cases, some statement is asserted through a description of a means of computing it. There may be a distinction worth making here, but I'm missing it so far. -- Chris Smith - Lead Software Developer / Technical Trainer MindIQ Corporation -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Marshall [EMAIL PROTECTED] wrote: +--- | Joachim Durchholz wrote: | Actually SQL has references - they are called primary keys, but they | are references nevertheless. | | I strongly object; this is quite incorrect. I grant you that from the | 50,000 foot level they appear identical, but they are not. +--- Agreed. The only thing different about primary keys from any other key is uniqueness -- a selection by primary key will return only one record. Other than that constraint, many databases treat them exactly the same as non-primary keys [e.g., can form indexes on them, etc.]. +--- | To qualify as a reference, there need to be reference and dereference | operations on the reference datatype; there is no such operation is SQL. +--- Not in ANSI SQL92, say, but there might be in most SQL databases! [See below re OIDs. Also, SQL:1999 had a REF type that was essentially and OID.] +--- | Would you say the relational algebra has references? +--- Don't confuse SQL relational algebra!! You'll get real relational algebraists *way* bent out of shape if you do that! +--- | (Some SQL dialects also offer synthetic ID fields that are | guaranteed to remain stable over the lifetime of a record. | | Primary keys are updatable; there is nothing special about them. +--- I think he's probably talking about OIDs (object IDs). Most current SQL-based databases provide them, usually as a normally- invisible system column that doesn't show up when you say SELECT * FROM, but that *does* appear if you say SELECT oid, *, and may be used as a primary key even on tables with no actual primary key: rpw3=# select * from toy limit 4; c1 | c2 | c3 | upd +---++- fall | tape | My Favorite Thanksgiving | 16 xmas | book | My Favorite Christmas | 2 xmas | video | The Grinch who Stole Christmas | 4 summer | book | Unusual 4ths of July | 17 (4 rows) rpw3=# select oid, * from toy limit 4; oid | c1 | c2 | c3 | upd ---++---++- 19997 | fall | tape | My Favorite Thanksgiving | 16 19998 | xmas | book | My Favorite Christmas | 2 1 | xmas | video | The Grinch who Stole Christmas | 4 2 | summer | book | Unusual 4ths of July | 17 (4 rows) rpw3=# select * from toy where oid = 19998; c1 | c2 | c3 | upd --+--+---+- xmas | book | My Favorite Christmas | 2 (1 row) rpw3=# insert into toy values ('fall','book','Glory Road'); INSERT 32785 1 rpw3=# select oid, * from toy where oid = 32785; oid | c1 | c2 | c3 | upd ---+--+--++- 32785 | fall | book | Glory Road | 21 (1 row) rpw3=# See http://www.postgresql.org/docs/8.1/static/datatype-oid.html for how PostgreSQL treats OIDs [including some critical limitations]. -Rob - Rob Warnock [EMAIL PROTECTED] 627 26th Avenue URL:http://rpw3.org/ San Mateo, CA 94403 (650)572-2607 -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Joe Marshall wrote: Marshall wrote: Consider the following Java fragment: void foo() { int i = 0; int j = 0; // put any code here you want j = 1; i = 2; // check value of j here. It is still 1, no matter what you filled in above. // The assignment to i cannot be made to affect the value of j. } True, but you have hidden the pointers. Semantically, the identifiers i and j refer not to integers but to locations that hold integers. The assignment modifies the location. What the implementation looks like shouldn't affect how we speak of the logical model. In the above code, there are no pointers. By your definition, pointer and variable are synonyms. That doesn't seem like a good idea to me. (What if i and j end up in registers? I have not heard it said that registers have addresses.) Those two local primitive variables cannot be made to have the same identity. But you can update them, so this is an example of mutability without the possibility of identity. The identity is temporal: You use the same variable name at two different times. Do you intend for the second `i' to mean the same variable as the first `i'? Okay, so i and i have the same identity. I suppose you could argue that the language's namespace is an address-space, and variable names are addresses. At this point, though, you've made the concept of identity so broad that it is now necessarily a property of all languages that use named values, whether updatable or not. I think you would also have to call 3 a void pointer by this scheme. Clearly there is *some* difference between a language which allows explicit pointers and thus aliasing and one that doesn't. What term would you use? First-class variables? Marshall -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Chris Smith wrote: Darren New [EMAIL PROTECTED] wrote: Chris Smith wrote: Unless I'm missing your point, I disagree with your disagreement. Mutability only makes sense because of object identity (in the generic sense; no OO going on here). Depends what you mean by object. int x = 6; int y = 5; x = y; I'd say x was mutable, with no identity problems involved? The variable x definitely has identity that's independent of its value. I'm not sure what you mean by that. I also see, though, that the majority (so far, I'd say all) of the potential uses for which it's worth introducing mutation into an otherwise mutation-free language allow the possibility of aliasing, which sorta makes me wonder whether this problem is worth solving. What about my example of SQL? Mutation, no pointers, no aliasing. Yet: useful. Marshall -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Marshall [EMAIL PROTECTED] wrote: Chris Smith wrote: Darren New [EMAIL PROTECTED] wrote: Chris Smith wrote: Unless I'm missing your point, I disagree with your disagreement. Mutability only makes sense because of object identity (in the generic sense; no OO going on here). Depends what you mean by object. int x = 6; int y = 5; x = y; I'd say x was mutable, with no identity problems involved? The variable x definitely has identity that's independent of its value. I'm not sure what you mean by that. I mean, simply, that when you can assign a value to a variable, then you care that it is that variable and not a different one. That's identity in the normal sense of the word. The code elsewhere in the procedure is able to access the value of 'x', and that has meaning even though you don't know what value 'x' has. This has definite implications, and is a useful concept; for example, it means that the pure lambda calculus no longer sufficient to express the semantics of the programming language, but instead something else is required. What you are asking for is some subset of identity, and I've not yet succeeded in understanding exactly what it is or what its limits are... except that so far, it seems to have everything to do with pointers or aliasing. I also see, though, that the majority (so far, I'd say all) of the potential uses for which it's worth introducing mutation into an otherwise mutation-free language allow the possibility of aliasing, which sorta makes me wonder whether this problem is worth solving. What about my example of SQL? Mutation, no pointers, no aliasing. Yet: useful. I'm not yet convinced that this is any different from a language with standard pointer aliasing. If I have two tables A and B, and a foreign key from A into B, then I run into the same problems with enforcing constraints that I would see with a pointer model... when I update a relation, I need to potentially check every other relation that contains a foreign key into it, in order to ensure that its constraints are not violated by that constraint. That's the same thing that is being pointed out as a negative consequence of aliasing in other languages. For example, executing: UPDATE P SET x = 5 WHERE y = 43; may result in the database having to re-evaluate the constraint that says that for all P(x, y, z), x must be less than 4 when z = 17. One difference is that while in general purpose programming languages this appears to be a daunting task, databases are set up to do these kinds of things all the time and contain optimizations for it... but the problem is still the same, and it would still present the same difficulties in doing formal proofs that running the above UPDATE statement doesn't violate any invariants. (If I'm wrong about that, please let me know; I'd very interested if that's so.) -- Chris Smith - Lead Software Developer / Technical Trainer MindIQ Corporation -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Chris Smith wrote: David Hopwood [EMAIL PROTECTED] wrote: This is true, but note that postconditions also need to be efficient if we are going to execute them. If checked by execution, yes. In which case, I am trying to get my head around how it's any more true to say that functional languages are compilable postconditions than to say the same of imperative languages. A postcondition must, by definition [*], be a (pure) assertion about the values that relevant variables will take after the execution of a subprogram. If a subprogram is intended to have side effects, then its postcondition can describe the results of the side effects, but must not reexecute them. [*] E.g. see http://www.spatial.maine.edu/~worboys/processes/hoare%20axiomatic.pdf, although the term postcondition was introduced later than this paper. -- David Hopwood [EMAIL PROTECTED] -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Chris Smith wrote: Marshall [EMAIL PROTECTED] wrote: Chris Smith wrote: Darren New [EMAIL PROTECTED] wrote: Chris Smith wrote: Unless I'm missing your point, I disagree with your disagreement. Mutability only makes sense because of object identity (in the generic sense; no OO going on here). Depends what you mean by object. int x = 6; int y = 5; x = y; I'd say x was mutable, with no identity problems involved? The variable x definitely has identity that's independent of its value. I'm not sure what you mean by that. I mean, simply, that when you can assign a value to a variable, then you care that it is that variable and not a different one. That's identity in the normal sense of the word. I guess it is, now that you mention it. The code elsewhere in the procedure is able to access the value of 'x', and that has meaning even though you don't know what value 'x' has. This has definite implications, and is a useful concept; for example, it means that the pure lambda calculus no longer sufficient to express the semantics of the programming language, but instead something else is required. What you are asking for is some subset of identity, and I've not yet succeeded in understanding exactly what it is or what its limits are... except that so far, it seems to have everything to do with pointers or aliasing. Perhaps it is specifically first-class identity, rather than identity per se. I also see, though, that the majority (so far, I'd say all) of the potential uses for which it's worth introducing mutation into an otherwise mutation-free language allow the possibility of aliasing, which sorta makes me wonder whether this problem is worth solving. What about my example of SQL? Mutation, no pointers, no aliasing. Yet: useful. I'm not yet convinced that this is any different from a language with standard pointer aliasing. If I have two tables A and B, and a foreign key from A into B, then I run into the same problems with enforcing constraints that I would see with a pointer model... when I update a relation, I need to potentially check every other relation that contains a foreign key into it, in order to ensure that its constraints are not violated by that constraint. That's the same thing that is being pointed out as a negative consequence of aliasing in other languages. No, that's not the same thing. What you are describing here is not an aliasing issue, but simply the consequences of allowing constraints to mention more than one variable. In our i-and-j example above, suppose there was a constraint such that i j. We have to re-check this constraint if we update either i or j. That's not the same thing as saying that i and j are aliased. A foreign key constraint is a multi-variable constraint. Specifically, a foreign key from table A, attribute a to table B, attribute b is the constraint: forall a in A, exists b in B such that a = b. Note that two variables, A and B, are referenced in the constraint. In general, any constraint on two variables will have to be rechecked upon update to either. For example, executing: UPDATE P SET x = 5 WHERE y = 43; may result in the database having to re-evaluate the constraint that says that for all P(x, y, z), x must be less than 4 when z = 17. I don't see any aliasing in this example either. But consider how much worse this problem is if real aliasing is possible. We have some pointer variable, and initialize it with the return value from some function. We don't know anything about what variable is involved. We update through the pointer. What constraints must we recheck? Apparently all of them; unless we have perfect alias analysis, we can't tell what variables are affected by our update. One difference is that while in general purpose programming languages this appears to be a daunting task, databases are set up to do these kinds of things all the time and contain optimizations for it... but the problem is still the same, and it would still present the same difficulties in doing formal proofs that running the above UPDATE statement doesn't violate any invariants. (If I'm wrong about that, please let me know; I'd very interested if that's so.) I'm interested to hear your reply. Marshall -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Marshall [EMAIL PROTECTED] wrote: What you are asking for is some subset of identity, and I've not yet succeeded in understanding exactly what it is or what its limits are... except that so far, it seems to have everything to do with pointers or aliasing. Perhaps it is specifically first-class identity, rather than identity per se. As in: the value of one variable can (be/refer to/depend on) the identity of another variable? I can certainly see this as as reasonable concept to consider. I'm not yet convinced that this is any different from a language with standard pointer aliasing. If I have two tables A and B, and a foreign key from A into B, then I run into the same problems with enforcing constraints that I would see with a pointer model... when I update a relation, I need to potentially check every other relation that contains a foreign key into it, in order to ensure that its constraints are not violated by that constraint. That's the same thing that is being pointed out as a negative consequence of aliasing in other languages. No, that's not the same thing. What you are describing here is not an aliasing issue, but simply the consequences of allowing constraints to mention more than one variable. A foreign key constraint is a multi-variable constraint. Specifically, a foreign key from table A, attribute a to table B, attribute b is the constraint: forall a in A, exists b in B such that a = b. Note that two variables, A and B, are referenced in the constraint. There's confusion here coming from different usages of the word variable. Let us talk instead of values, and of the abstract structures that gives them meaning. In both cases (invariants in a hypothetical imperative language, and in a relational database), the constraints make reference to these structures of values (relations, for example, or various kinds of data structures), and not to the individual values or objects that they contain. In both cases, the problem is not that we don't know what structures to check to verify the invariant; rather, it's that we have to check ALL of the values in that structure. As someone pointed out, this is to be expected in a world of mutable things with identity that are globally locatable. It is simple fact that if I tell you I spoke to Barbara's husband, you may need to trace down who Barbara's husband is before you could discover that, for example, maybe I actually spoke to your boss, or to your nephew's best- friend's father. If databases are capable of modeling these kinds of relationships (and of course they are), then they are as susceptible to aliasing -- in a logical sense that avoids mention of pointer -- as anyone else. I don't see any aliasing in this example either. Actually, this was probably a bad example. Let's stick to the others involving relationships between tuples. -- Chris Smith - Lead Software Developer / Technical Trainer MindIQ Corporation -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
David Hopwood wrote: George Neuner wrote: All of this presupposes that you have a high level of confidence in the compiler. I've been in software development for going in 20 years now and worked 10 years on high performance, high availability systems. In all that time I have yet to meet a compiler ... or significant program of any kind ... that is without bugs, noticeable or not. [...] One of the main reasons for this, IMHO, is that many compilers place too much emphasis on low-level optimizations of dubious merit. For C and Java, I've taken to compiling all non-performance-critical code without optimizations, and that has significantly reduced the number of compiler bugs that I see. It has very little effect on overall execution performance (and compile times are quicker). I don't think that over-complex type systems are the cause of more than a small part of the compiler bug problem. In my estimation, the frequency of bugs in different compilers *for the same language* can vary by an order of magnitude. Also, the use of typed intermediate languages within the compiler might actually help drastically cutting down on the more severe problem of code transformation bugs, notwithstanding the relative complexity of suitable internal type systems. - Andreas -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Marshall schrieb: Joachim Durchholz wrote: Marshall schrieb: Now, I'm not fully up to speed on DBC. The contract specifications, these are specified statically, but checked dynamically, is that right? That's how it's done in Eiffel, yes. In other words, we can consider contracts in light of inheritance, but the actual verification and checking happens at runtime, yes? Sure. Though, while DbC gives rules for inheritance (actually subtypes), these are irrelevant to the current discussion; DbC-minus-subtyping can still be usefully applied. Yes, subtyping. Of course I meant to say subtyping.blush No need to blush for that, it's perfectly OK in the Eiffel context, where a subclass ist always assumed to be a subtype. (This assumption isn't always true, which is why Eiffel has some serious holes in the type system. The DbC ideas are still useful though, saying subtype instead of subclass just makes the concept applicable outside of OO languages.) I can certainly see how DbC would be useful without subtyping. But would there still be a reason to separate preconditions from postconditions? I've never been clear on the point of differentiating them (beyond the fact that one's covariant and the other is contravariant.) There is indeed. The rules about covariance and contravariance are just consequences of the notion of having a subtype (albeit important ones when it comes to designing concrete interfaces). For example, if a precondition fails, something is wrong about the things that the subroutine assumes about its environment, so it shouldn't have been called. This means the error is in the caller, not in the subroutine that carries the precondition. The less important consequence is that this should be reflected in the error messages. The more important consequences apply when integrating software. If you have a well-tested library, it makes sense to switch off postcondition checking for the library routines, but not their precondition checking. This applies not just for run-time checking: Ideally, with compile-time inference, all postconditions can be inferred from the function's preconditions and their code. The results of that inference can easily be stored in the precompiled libraries. Preconditions, on the other hand, can only be verified by checking all places where the functions are called. Wouldn't it be possible to do them at compile time? (Although this raises decidability issues.) Exactly, and that's why you'd either uses a restricted assertion language (and essentially get something that's somewhere between a type system and traditional assertion); or you'd use some inference system and try to help it along (not a simple thing either - the components of such a system exist, but I'm not aware of any system that was designed for the average programmer). As to the average programmer, I heard this recently on comp.databases.theory: Don't blame me for the fact that competent programming, as I view it as an intellectual possibility, will be too difficult for the average programmer -you must not fall into the trap of rejecting a surgical technique because it is beyond the capabilities of the barber in his shop around the corner. -- EWD512 Given the fact that we have far more need for competently-written programs than for competent surgery, I don't think that we'll be able to follow that idea. Mightn't it also be possible to leave it up to the programmer whether a given contract was compile-time or runtime? I'd agree with that, but I'm not sure how well that would hold up in practice. I want to try it and see what it's like. So do I :-) Regards, Jo -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Darren New schrieb: As far as I understand it, Eiffel compilers don't even make use of postconditions to optimize code or eliminate run-time checks (like null pointer testing). That's correct. I think a large part of the reasons why this isn't done is that Eiffel's semantics is (a) too complicated (it's an imperative language after all), and (b) not formalized, which makes it difficult to assess what optimizations are safe and what aren't. (Reason (c) is that Eiffel compiler vendors don't have the manpower for this kind of work, mostly in quantity but also, to some degree, in quality: people with a solid language semantics background tend to be repelled by the language inventor's know-it-all deny-the-problems don't-bother-me-with-formalisms attitude. He still has moved the state of the art ahead - mostly by pointing out a certain class of problems in OO designs and explaining them lucidly, and proposing solutions that hold up better than average even if still fundamentally flawed.) Regards, Jo -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Joachim Durchholz wrote: Darren New schrieb: As far as I understand it, Eiffel compilers don't even make use of postconditions to optimize code or eliminate run-time checks (like null pointer testing). That's correct. I think a large part of the reasons why this isn't done is that Eiffel's semantics is (a) too complicated (it's an imperative language after all), and (b) not formalized, which makes it difficult to assess what optimizations are safe and what aren't. I can see the lack of a formal model being an issue, but is the imperative bit really all that much of an obstacle? How hard is it really to deal with assignment? Or does the issue have more to do with pointers, aliasing, etc.? Marshall -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Joachim Durchholz wrote: Marshall schrieb: I can certainly see how DbC would be useful without subtyping. But would there still be a reason to separate preconditions from postconditions? I've never been clear on the point of differentiating them (beyond the fact that one's covariant and the other is contravariant.) There is indeed. The rules about covariance and contravariance are just consequences of the notion of having a subtype (albeit important ones when it comes to designing concrete interfaces). For example, if a precondition fails, something is wrong about the things that the subroutine assumes about its environment, so it shouldn't have been called. This means the error is in the caller, not in the subroutine that carries the precondition. The less important consequence is that this should be reflected in the error messages. The more important consequences apply when integrating software. If you have a well-tested library, it makes sense to switch off postcondition checking for the library routines, but not their precondition checking. This applies not just for run-time checking: Ideally, with compile-time inference, all postconditions can be inferred from the function's preconditions and their code. The results of that inference can easily be stored in the precompiled libraries. Preconditions, on the other hand, can only be verified by checking all places where the functions are called. Interesting! So, what exactly separates a precondition from a postcondition from an invariant? I have always imagined that one writes assertions on parameters and return values, and those assertions that only reference parameters were preconditions and those which also referenced return values were postconditions. Wouldn't that be easy enough to determine automatically? And what's an invariant anyway? Marshall -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Marshall wrote: So, what exactly separates a precondition from a postcondition from an invariant? A precondition applies to a routine/method/function and states the conditions under which a function might be called. For example, a precondition on stack.pop might be not stack.empty, and socket.read might have a precondition of socket.is_open, and socket.open_a_new_connection might have a precondition of socket.is_closed. A postcondition applies to a routine/method/function and states (at least part of) what that routine guarantees to be true when it returns, assuming it is called with true preconditions. So not stack.empty would be a postcondition of stack.push. If your postconditions are complete, they tell you what the routine does. An invariant is something that applies to an entire object instance, any time after the constructor has completed and no routine within that instance is running (i.e., you're not in the middle of a call). There should probably be something in there about destructors, too. So an invariant might be stack.is_empty == (stack.size == 0) or perhaps socket.is_open implies (socket.buffer != null) or some such. The first problem with many of these sorts of things are that in practice, there are lots of postconditions that are difficult to express in any machine-readable way. The postcondition of socket.read is that the buffer passed as the first argument has valid data in the first n bytes, where n is the return value from socket.read, unless socket.read returned -1. What does valid mean there? It has to match the bytes sent by some other process, possibly on another machine, disregarding the bytes already read. You can see how this can be hard to formalize in any particularly useful way, unless you wind up modelling the whole universe, which is what most of the really formalized network description techniques do. Plus, of course, without having preconditions and postconditions for OS calls, it's difficult to do much formally with that sort of thing. You can imagine all sorts of I/O that would be difficult to describe, even for things that are all in memory: what would be the postcondition for screen.draw_polygon? Any set of postconditions on that routine that don't mention that a polygon is now visible on the screen would be difficult to take seriously. There are also problems with the complexity of things. Imagine a chess-playing game trying to describe the generate moves routine. Precondition: An input board with a valid configuration of chess pieces. Postcondition: An array of boards with possible next moves for the selected team. Heck, if you could write those as assertions, you wouldn't need the code. -- Darren New / San Diego, CA, USA (PST) This octopus isn't tasty. Too many tentacles, not enough chops. -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Marshall schrieb: I can see the lack of a formal model being an issue, but is the imperative bit really all that much of an obstacle? How hard is it really to deal with assignment? Or does the issue have more to do with pointers, aliasing, etc.? Actually aliasing is *the* hard issue. Just one data point: C compiler designers spend substantial effort to determine which data structures cannot have aliasing to be able to apply optimizations. This can bite even with runtime assertion checking. For example, ordinarily one would think that if there's only a fixed set of routines that may modify some data structure A, checking the invariants of that structure need only be done at the exit of these routines. Now assume that A has a reference to structure B, and that the invariants involve B in some way. (E.g. A.count = A-B.count.) There's still no problem with that - until you have a routine that exports a reference to B, which gets accessed from elsewhere, say via C. Now if I do C-B.count = 27 I will most likely break A's invariant. If that assignment is in some code that's far away from the code for A, debugging can become exceedingly hard: the inconsistency (i.e. A violating its invariant) can live for the entire runtime of the program. So that innocent optimization don't check all the program's invariants after every assignment immediately breaks with assignment (unless you do some aliasing analysis). In an OO language, it's even more difficult to establish that some data structure cannot be aliased: even if the code for A doesn't hand out a reference to B, there's no guarantee that some subclass of A doesn't. I.e. the aliasing analysis has to be repeated whenever there's a new subclass, which may happen at link time when you'd ordinarily don't want to do any serious semantic analysis anymore. There's another way around getting destroyed invariants reported at the time the breakage occurs: lock any data field that goes into an invariant (or a postcondition, too). The rationale behind this is that from the perspective of assertions, an alias walks, smells and quacks just like concurrent access, so locking would be the appropriate answer. The problem here is that this means that updates can induce *very* expensive machinery - imagine an invariant that says A-balance is the sum of all the transaction-amount fields in the transaction list of A, it would cause all these -amount fields to be locked as soon as a transaction list is hooked up with the amount. OTOH one could argue that such a locking simply makes cost in terms of debugging time visible as runtime overhead, which isn't entirely a Bad Thing. There are all other kinds of things that break in the presence of aliasing; this is just the one that I happen to know best :-) Without mutation, such breakage cannot occur. Invariants are just the common postconditions of all the routines that may construct a structure of a given type. Regards, Jo -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Marshall schrieb: So, what exactly separates a precondition from a postcondition from an invariant? I have always imagined that one writes assertions on parameters and return values, and those assertions that only reference parameters were preconditions and those which also referenced return values were postconditions. Wouldn't that be easy enough to determine automatically? Sure. Actually this can be done in an even simpler fashion; e.g. in Eiffel, it is (forgive me if I got the details wrong, it's been several years since I last coded in it): foo (params): result_type is require -- list of assertions; these become preconditions do -- subroutine body ensure -- list of assertions; these become postconditions end And what's an invariant anyway? In general, it's a postcondition to all routines that create or modify a data structure of a given type. Eiffel does an additional check at routine entry, but that's just a last-ditch line of defense against invariant destruction via aliases, not a conceptual thing. Keep aliases under control via modes (i.e. unaliasable marks on local data to prevent aliases from leaving the scope of the data structure), or via locking, or simply by disallowing destructive updates, and you don't need the checks at routine entry anymore. Regards, Jo -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Joachim Durchholz wrote: Marshall schrieb: I can see the lack of a formal model being an issue, but is the imperative bit really all that much of an obstacle? How hard is it really to deal with assignment? Or does the issue have more to do with pointers, aliasing, etc.? Actually aliasing is *the* hard issue. Okay, sure. Nice explanation. But one minor point: you describe this as an issue with imperative languages. But aliasing is a problem associated with pointers, not with assignment. One can have assignment, or other forms of destructive update, without pointers; they are not part of the definition of imperative. (Likewise, one can have pointers without assignment, although I'm less clear if the aliasing issue is as severe.) Marshall -- http://mail.python.org/mailman/listinfo/python-list
Re: What is a type error?
Marshall schrieb: Joachim Durchholz wrote: Marshall schrieb: I can see the lack of a formal model being an issue, but is the imperative bit really all that much of an obstacle? How hard is it really to deal with assignment? Or does the issue have more to do with pointers, aliasing, etc.? Actually aliasing is *the* hard issue. Okay, sure. Nice explanation. But one minor point: you describe this as an issue with imperative languages. But aliasing is a problem associated with pointers, not with assignment. Aliasing is not a problem if the aliased data is immutable. One can have assignment, or other forms of destructive update, without pointers; they are not part of the definition of imperative. Sure. You can have either of destructive updates and pointers without incurring aliasing problems. As soon as they are combined, there's trouble. Functional programming languages often drop assignment entirely. (This is less inefficient than one would think. If everything is immutable, you can freely share data structures and avoid some copying, and you can share across abstraction barriers. In programs with mutable values, programmers are forced to choose the lesser evil of either copying entire data structures or doing a cross-abstraction analysis of who updates what elements of what data structure. A concrete example: the first thing that Windows does when accepting userland data structures is... to copy them; this were unnecessary if the structures were immutable.) Some functional languages restrict assignment so that there can exist at most a single reference to any mutable data structure. That way, there's still no aliasing problems, but you can still update in place where it's really, really necessary. I know of no professional language that doesn't have references of some kind. Regards, Jo -- http://mail.python.org/mailman/listinfo/python-list