On Monday 20 December 2010 01:52:58 spir wrote:
> On Mon, 20 Dec 2010 01:29:13 -0800
>
> Jonathan M Davis <jmdavisp...@gmx.com> wrote:
> > > For me, the important difference is that classes are referenced, while
> > > structs are plain values. This is a semantic distinction of highest
> > > importance. I would like structs to be subtype-able and to implement
> > > (runtime-type-based) polymorphism.
> >
> > Except that contradicts the facts that they're value types. You can't
> > have a type which has polymorphism and is a value type. By its very
> > nature, polymorphism requires you to deal with a reference.
>
> Can you expand on this?
>
> At least Oberon has value structs ("records") with inheritance and
> polyporphism; I guess the turbo Pascal OO model was of that kind, too
> (unsure) -- at least the version implemented in freepascal seems to work
> fine that way. And probably loads of less known PLs provide such a
> feature. D structs could as well IIUC: I do not see the relation with
> instances beeing implicitely referenced. (Except that they must be passed
> by ref to "member functions" they are the receiver of, but this is true
> for any kind of OO, including present D structs.)
>
> (I guess we have very different notions of "reference", as shown by
> previous threads.)

Okay. This can get pretty complicated, so I'm likely to screw up on some of
the details, but this should give you a basic idea of what's going on.

In essentially any C-based language, when you declare an integer on the
stack like so:

    int a = 2;

you set aside a portion of the stack which is the exact size of an int
(typically 32 bits, but that will depend on the language). If you declare a
pointer,

    int* a;

then you're setting aside a portion of the stack the size of a pointer (32
bits on a 32-bit machine and 64 bits on a 64-bit machine). That variable
then holds an address, typically of somewhere on the heap, though it could
be an address somewhere on the stack. In the case of int*, the address
pointed to will refer to a 32-bit block of memory which holds an int.

If you put a struct or a class on the stack, say,

    class A
    {
        int a;
        float b;
    }

then you're setting aside exactly as much space as that type requires to
hold itself. At minimum, that will be the total size of its member
variables (in this case an int and a float, so probably a total of 64
bits), but it often will include extra padding to align the variables along
appropriate boundaries for the sake of efficiency, and depending on the
language, it could have extra type information.
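
As a rough illustration of the sizes discussed above, here is a minimal
sketch in C++ (chosen because it lets you put class objects directly on the
stack). The Padded type and the size_of_* names are made up for this
example; the actual numbers depend on the compiler and platform, so the
code only checks lower bounds rather than exact sizes.

```cpp
#include <cassert>
#include <cstddef>

// Mirrors the A from the text: an int followed by a float.
struct A {
    int a;
    float b;
};

// Hypothetical type, picked so that padding is likely to be visible:
// most compilers insert padding after 'tag' so that 'value' is aligned.
struct Padded {
    char tag;
    int value;
};

// All of these are compile-time constants: the layout of each type is
// fixed before the program ever runs.
constexpr std::size_t size_of_a      = sizeof(A);
constexpr std::size_t size_of_padded = sizeof(Padded);
constexpr std::size_t size_of_ptr    = sizeof(int*);

// A type is at least as big as the sum of its members...
static_assert(sizeof(A) >= sizeof(int) + sizeof(float),
              "A must hold both of its members");
// ...and may be bigger because of alignment padding.
static_assert(sizeof(Padded) >= sizeof(char) + sizeof(int),
              "Padded must hold both of its members");
```

On a typical 64-bit desktop compiler you would expect size_of_a to be 8,
size_of_padded to be 8 rather than 5, and size_of_ptr to be 8, but none of
that is guaranteed by the language.
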
If the class has a virtual table (which it will if it has virtual functions,
which in most any language other than C++ would mean that it definitely has
a virtual table), then that adds to the space required for the class as
well, typically as a hidden pointer to the table stored in each object.

Virtual functions are polymorphic: when you call a virtual function, it
calls the version of the function for the actual type of the object rather
than for the type of the pointer or reference that you're using to refer to
the object. When a non-virtual function is called, the version of the
function for the type of the pointer or reference is used. All class
functions are virtual in D unless the compiler determines that they don't
have to be and optimizes that out (typically because they're final). Struct
functions and stand-alone functions are never virtual.

The exact memory layout of a type _must_ be known at compile time. The
exact amount of space required is then known, so that the stack layout can
be done appropriately. If you're dealing with a pointer, then the exact
memory layout of the memory being pointed to needs to be known when that
memory is initialized, but the pointer doesn't necessarily need to know it.
This means that you can have a pointer of one type point to a variable of
another type.

Now, assuming that you're not subverting the type system (e.g. by casting
int* to float*), you're dealing with inheritance. For instance, you have

    class B : A
    {
        bool c;
    }

and a variable of type A*. That pointer could point to an object which is
exactly of type A, or it could point to any subtype of A. B is derived from
A, so the object could be a B. As long as the functions are virtual, you
can have polymorphic functions by having the virtual table used to call the
version of the function for the type that the object actually is rather
than the type that the pointer is.
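
The difference between virtual and non-virtual calls through an A* can be
sketched in C++ as follows. This is only an illustration: the member
functions name() and static_name() and the helpers call_virtual() and
call_non_virtual() are invented for the example and don't come from any
library.

```cpp
#include <cassert>
#include <string>

struct A {
    int a = 0;
    float b = 0.0f;
    virtual ~A() = default;
    // Virtual: the version is picked from the object's actual type.
    virtual std::string name() const { return "A"; }
    // Non-virtual: the version is picked from the pointer's type.
    std::string static_name() const { return "A"; }
};

struct B : A {
    bool c = false;
    std::string name() const override { return "B"; }
    // Hides A::static_name rather than overriding it.
    std::string static_name() const { return "B"; }
};

// Both helpers only know that they have an A*; the object behind the
// pointer may be an A or anything derived from A.
std::string call_virtual(const A* p)     { return p->name(); }
std::string call_non_virtual(const A* p) { return p->static_name(); }
```

Given a B object b, call_virtual(&b) goes through the virtual table and
returns "B", while call_non_virtual(&b) is resolved from the pointer type
at compile time and returns "A".
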

References are essentially the same as pointers (though they may have some
extra information with them, making them a bit bigger than a pointer would
be in terms of the amount of space required on the stack). However, in the
case of D, pointers are _not_ treated as polymorphic (regardless of whether
a function is virtual or not), whereas references _are_ treated as
polymorphic (why, I don't know - probably to simplify pointers). In C++
though, pointers are polymorphic.

Now, if you have a variable of type A*, you could do something like this:

    B* b = new B();
    A* a = b;

a takes up 32 or 64 bits in memory and holds the memory location on the
heap where the B object is. Both pointers have the same value and point to
the same object. The only difference is how the compiler treats each type
(e.g. you can't call a B function on the a variable). Calling A functions
on the a variable will call the B version if it has its own version and the
function is virtual.

However, what about this:

    B b;
    A a = b;

The memory layout of b and a must be known at compile time. They're laid
out precisely on the stack. b has the size of a B object. a has the size of
an A object. a is _exactly_ an A. It cannot be a B. So, what you get is
called slicing. The A portions of the variable are assigned (in this case,
the int and the float), whereas the B portions aren't assigned. a is now
exactly as it would have been had you created it with its member variables
having the same values that b's member variables from its A portion had.
This is almost certainly _not_ what you wanted.

Now, because a is exactly an A, and b is exactly a B, when you go to call
functions on them, it doesn't matter whether they're virtual or not. The
type of the variable _is_ the type of the object. There is no polymorphism.
You _need_ that level of indirection to get it.

Now, you could conceivably have a language where all of its objects were
actually pointers, but they were treated as value types.
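
Before turning to that idea, the value-copy case above can be reproduced in
C++, where class objects can live directly on the stack. This is a minimal
sketch: the name() member and the two helper functions are invented for the
illustration.

```cpp
#include <cassert>
#include <string>

struct A {
    int a = 0;
    float b = 0.0f;
    virtual ~A() = default;
    virtual std::string name() const { return "A"; }
};

struct B : A {
    bool c = false;
    std::string name() const override { return "B"; }
};

// Copies a B into an A by value and asks the copy what it is.
std::string name_after_value_copy() {
    B b;
    b.a = 7;
    b.c = true;
    A a = b;           // only the A portion of b is copied; c is dropped
    assert(a.a == 7);  // the A members did make it across
    return a.name();   // a is exactly an A, so this is A's version
}

// Same object, but reached through a pointer instead of copied.
std::string name_through_pointer() {
    B b;
    A* p = &b;         // no copy: p refers to the whole B object
    return p->name();  // dispatched through the virtual table
}
```

name_after_value_copy() returns "A" even though name() is virtual, because
the copy really is an A. name_through_pointer() returns "B", since the
indirection leaves the original object intact.
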
In such a language,

    B b;
    A a = b;

would actually be declaring

    B* b;
    A* a = b;

under the hood, except that the assignment would do a deep copy and
allocate the appropriate memory rather than just copying the pointer as
would happen in a language like C++ or D. Perhaps that's what Oberon does.
I have no idea. I have never heard of the language before, let alone used
it. However, that's _not_ how C++, D, C#, or Java works. If you declare

    B b;
    A a = b;

then you are literally putting a B and an A on the stack, and assignments
from a B to an A will cause slicing.

D chose to avoid the slicing issue by making structs not have inheritance.
This also means that they don't have a virtual table, which makes them more
efficient. Classes have inheritance and a virtual table, but because
they're on the heap and are only ever accessed through references, you
don't get slicing and polymorphism works just fine.

So, what it comes down to is that you can't have polymorphism for a stack
object because you know _exactly_ what its type is, and you can't have
inheritance for a stack object without risking slicing when assignments are
made (unless you disallow assignments between objects that aren't of the
exact same type).

So, you're never going to see inheritance for structs in D. It doesn't fit
its memory model at all. What you get instead are templates, which can be
used to generate the same code for different types. And that's as close as
you're going to get to polymorphism for structs.

- Jonathan M Davis