There's a lot of confusion on php-internals about a proposal for type hinting, so someone did a writeup on related terminology, including something called "popcorn WTF".
Wietse

----- Forwarded message from Martin Alterisio -----

Message-ID: <[EMAIL PROTECTED]>
Date: Sun, 6 Jan 2008 23:58:54 -0300
From: "Martin Alterisio" <[EMAIL PROTECTED]>
To: "PHP Developers Mailing List" <internals@lists.php.net>
Subject: [PHP-DEV] type hinting

A friend told me you were having a most interesting debate on type hinting in the internals list, so when I got some free time I checked the newsgroup to see how it was coming along. It's quite interesting, and many good points have been made already. But it is quite difficult to understand some of the concepts that some members of the community are trying to convey. This seems to be caused by the lack of agreement on a common terminology. I'd like to review some of the terms used, to the best of my abilities, to help further the debate by establishing a common ground of understanding. I left some side notes (marked with an *) that should be regarded as general opinion and considered subjective points of view.

- Static typing

Also known as "early type binding" and "variable type binding". With static typing, type checking is done at compile time, and type information is bound to the variable (a variable may not change its type).

- Dynamic typing

Also known as "late type binding" and "value type binding". With dynamic typing, type checking is done at runtime, and type information is bound to the value (a variable may change its type along with its value).

* Performance considerations regarding static vs dynamic typing

Dynamic typing is generally considered to have better performance in interpreted languages. Static typing requires examining the whole system for type checks, so compilation is more complex and takes much more time to complete. Since interpreters run both the compilation and the execution, the overhead caused by static typing can be quite considerable. Nevertheless, this is not completely true.
The C/C++ languages have overcome this issue by dividing code units into declaration (header files) and definition (the actual code). Code units include only the declarations of external names, and compile time is greatly improved because the compile/link cycle allows recompiling only the code units that were affected by changes. The Java language has overcome this issue by enforcing an organization model and code structure, and because the Java Machine object code retains name signatures (so there is no need for a linker).

Static typing is generally considered to have better performance in compiled languages. Dynamic typing requires type checks before most operations. These checks make execution much slower. Once a program is compiled, the only thing that matters is its performance on execution, so static typing seems to be the better choice. As happened with the performance issues of static typing, it might prove (or rather has already been proven) that this is not entirely true. It has been shown, theoretically and practically, that a dynamically typed language may reduce the performance impact by doing code analysis or by having the user indicate expected types. Knowing the type to expect, the compiler/interpreter could prepare a better implementation of the code unit, with type checking needed only on input data.

- Type conversion

Type conversion occurs when a value of a certain type is converted to a value of another type. There are two kinds of type conversion: type casting and type coercion.

- Type casting

Type casting is an explicit type conversion. The user explicitly dictates that a value should be converted to a type.

- Type coercion

Also known as "type juggling" in PHP-dom. Type coercion is an implicit type conversion. The language tries to find a way to convert a value to meet its destination type.
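
To make the casting/coercion distinction above concrete, here is a minimal sketch. It is written in Python rather than PHP purely so that it is easy to run, and the function names are hypothetical, invented for this illustration. Note that Python coerces far less than PHP does: it promotes an int to a float in mixed arithmetic, but refuses to turn a string into a number on its own, whereas PHP's type juggling evaluates "5" + 3 to 8.

```python
# Illustrative sketch only: Python stands in for PHP here, and the helper
# names below are hypothetical, not taken from the original message.


def cast_example(text: str) -> int:
    """Type casting: the user explicitly asks for the conversion."""
    return int(text)  # explicit: str -> int


def coercion_example(count: int, ratio: float) -> float:
    """Type coercion: the language converts the int operand to float
    on its own so that the multiplication can proceed."""
    return count * ratio  # implicit: int -> float


def no_coercion_example(text, number) -> str:
    """Python will not coerce between str and int for '+';
    it raises TypeError instead of guessing a destination type."""
    try:
        return str(text + number)
    except TypeError as exc:
        return "TypeError: " + str(exc)


if __name__ == "__main__":
    print(cast_example("42"))
    print(coercion_example(3, 0.5))
    print(no_coercion_example("5", 3))
```

The third function is where the "WTF factor" discussion starts: a language has to decide whether such a mixed operation is an error or something to be silently converted.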

* General considerations regarding type conversion

Type coercion has usually been considered harmful and unsafe, and very prone to raise what has been generally referred to on this list as the "WTF factor" (expected behavior that raises reasonable doubts as to "who could really expect that to happen?", pre-announced with a WTF onomatopoeia). Nevertheless, modern language trends indicate that type coercion has come into rather widespread use. C++ has had type coercion for a very long time (can we call C++ a "modern language"?), many scripting languages have type coercion, and even Java has introduced type coercion for its native types and wrappers (what's called autoboxing and unboxing in java-dom).

- Type inference

A variant of static typing where types are left undeclared or given as type parameters. The actual types are completed when the code is used, filling in the blanks with the types needed. This is a common feature of functional languages, and is also known to be used in imperative languages with what's called "generic programming".

- Generic programming (since it's mentioned before, but it really doesn't matter to this debate)

Also known as "template programming". Generic programming is a programming technique where code is programmatically generated on demand from code templates, where template parameters are filled in according to the needs of the actual code. It can be supported by the language or implemented as an external precompiler (or both). PHP inadvertently supports some form of generic programming. It's usually confused with parametric typing. Although generic programming may allow some form of parametric typing, it's not the same concept and should not be confused with it.

- Parametric typing (doesn't matter either, but mentioned before, so for clarification)

Parametric typing allows the definition of parametrized types. Parametrized types are types that are defined on other types.
The best-known parametrized type in computer science is the array (not the array from dynamic-typing-dom, at least). The computer science "canon" considers the array to be parametrized on the type of value it can hold, so we have an array of ints, an array of strings, an array of X, etc. Since dynamic typing binds the type to the value, arrays in dynamic typing can hold any type, so the array there is not, strictly speaking, a parametrized type.

- Typing strength

Typing strength is the measurement of how much the language forces you to work with types. How this measurement is done is a matter of much discussion.

- Strong typing

There's a general acceptance that "strong typing" refers to the use of static typing and the lack of type coercion and type inference. A more radical view on "strong typing" dictates that type conversion should be completely disallowed. We can refer to those conditions as "very strong typing".

- Weak typing

Weak typing is the opposite of strong typing. Generally speaking, if a language has dynamic typing, type coercion, or type inference, it's usually considered to be weakly typed. The degree of this weakness varies with how much it allows you to circumvent the use of types.

- Type safety

Type safety is the set of mechanisms provided to prevent errors caused by type misuse, commonly referred to as "type errors".

- Type signature

A type signature is the set of expected types on the input and output data of an algorithm. It's generally used to refer to the types of the arguments and the return value of a function or method. The language must guarantee type safety by ensuring that the type signature is obeyed.

- Type hinting

Also known as "type annotations", but that term is discouraged since it can be confused with the type declarations of static typing. It's a relatively new concept, and there's not much literature on the subject. One source of definitions on type hinting is the ECMAScript4 proposals, where this feature is referred to as "type annotations".
Type hinting is the optional use of static typing checks in a dynamic typing environment. It's meant to provide an optional increase of type safety, for those who are used to static typing, in a scripting environment that traditionally uses dynamic typing.

* Performance considerations on type signature vs type hinting

These considerations are only relevant in a dynamic typing environment. A type signature only demands checks at the entry point and the exit point, so fewer checks are usually done. Type hinting demands that the type is checked wherever necessary to ensure the type constraint is followed, as if static typing were used. Therefore type hinting is much more time consuming due to type checks. Nevertheless, these type checks can be avoided if enough type hints are known at compile time. Moreover, when these conditions are met, the compiler could theoretically profit from the performance benefits inherent to static typing.

---- Now, some random facts ----

% You should have guessed by now the irony of C being strongly typed and C++ being weakly typed. Herein lie most of the reasons for discussion on what the hell is wrong with the scale that measures typing strength... (I believe it's not the scale, just the fact that it seems hard to recognize C++ as a completely different language)

% Yeap, strictly speaking, PHP doesn't support type hinting, just type signatures.

% PHP uses type coercion for native types, but silently ignores type conversion errors = random WTF

% If you want type signatures with native types in PHP, you'll have to do type coercion (the array type hint should do it too). If not, you're just making strong typing demands on a strong-typing-unfriendly environment.

% PHP type coercion + type coercion on type signatures = popcorn WTF

% Type coercion on type signatures and passing by reference do not coexist nicely. A challenge for the daring to solve.

% A full implementation of type hinting is quite a challenge and better left to a major version release. Some opcode mojo and backpatching voodoo should be concocted to get the juice out of this fruit.

----

I might sound like a broken record by now, but sorry for my never-ending intrusions. I'm just too much of a geek to resist the temptation of meddling in a CS-related debate.

Best Regards,

Your friendly neighborhood wannabe game developer,
Martin Alterisio

----- End of forwarded message from Martin Alterisio -----

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php