There's a lot of confusion on php-internals about a proposal for
type hinting, so someone did a writeup on related terminology,
including something called "popcorn WTF".

        Wietse

----- Forwarded message from Martin Alterisio -----

Message-ID: <[EMAIL PROTECTED]>
Date: Sun, 6 Jan 2008 23:58:54 -0300
From: "Martin Alterisio" <[EMAIL PROTECTED]>
To: "PHP Developers Mailing List" <internals@lists.php.net>
Subject: [PHP-DEV] type hinting

A friend told me you were having a most interesting debate on type hinting
in the internals, so when I got some free time I checked the newsgroup to see
how it was coming along. It's quite interesting and many good points have been
made already. But it is quite difficult to understand some concepts that
some members of the community are trying to convey. It seems that this is
caused by the lack of an agreement on common terminology. I'd like to review
some of the terms used, to the best of my capabilities, to help further the
debate by establishing a common ground of understanding.

I left some side notes (marked with an *) that should be regarded as general
opinion and considered as subjective points of view.

- Static typing

Also known as "early type binding" and "variable type binding".
In static typing, type checking is done at compile time, and type
information is bound to the variable (a variable may not change its type).

- Dynamic typing

Also known as "late type binding" and "value type binding".
In dynamic typing, type checking is done at runtime, and type information is
bound to the value (a variable may change its type along with its value).
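
As an illustration of the dynamic typing definition above, here is a minimal
sketch in Python (chosen only because it is dynamically typed and easy to run;
it is not part of the PHP discussion). The names x, first, second and
runtime_check are made up for the example.

```python
# Dynamic typing: the type belongs to the value, not to the variable name.
x = 42
first = type(x).__name__       # type of the value currently bound to x

x = "forty-two"                # rebinding x to a value of another type is allowed
second = type(x).__name__


def runtime_check():
    # The type check happens only when the operation actually executes.
    try:
        return "abc" + 1
    except TypeError as exc:
        return "TypeError: " + str(exc)


result = runtime_check()
```

In a statically typed language, both the second assignment to x and the
expression inside runtime_check would normally be rejected before the program
ever runs.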

* Performance considerations regarding static vs dynamic typing

Dynamic typing is generally considered to have better performance in
interpreted languages. Static typing requires examining the whole system
for type checks, so compilation is more complex and takes much more
time to complete. Since interpreters both compile and execute, the
overhead caused by static typing can be quite considerable. Nevertheless,
this is not completely true. C and C++ have overcome this issue by
dividing code units into declaration (header files) and definition (the
actual code). Code units just include declarations of external names, and
compile time is greatly improved, as the compile/link cycle allows
recompiling only the code units that were affected by changes. The Java
language has overcome this issue by enforcing an organization model and code
structure, and because the Java Virtual Machine object code retains name
signatures (so, no need for a linker).

Static typing is generally considered to have better performance in compiled
languages. Dynamic typing requires type checks before most operations. These
checks make execution much slower. Once a program is compiled, all that
matters is its performance on execution, so static typing seems to be the
better choice. As with the performance issues of static typing,
it might prove (or rather has already been proven) that this is not entirely
true. It has been shown, theoretically and practically, that a dynamically
typed language may reduce the performance impact by doing code analysis or
having the user indicate expected types. Knowing the type to expect,
the compiler/interpreter can prepare a better implementation of the code
unit, with type checking needed only on input data.

- Type conversion

Type conversion occurs when a value of certain type is converted to a value
of another type.
There are two types of type conversion: type casting and type coercion.

- Type casting

Type casting is an explicit type conversion.
The user explicitly dictates that a value should be converted to a type.

- Type coercion

Also known as "type juggling" in PHP-dom.
Type coercion is an implicit type conversion.
The language tries to find a way to convert a value to meet its destination
type.
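
The difference between casting and coercion can be sketched in a few lines of
Python (used here only as a runnable illustration; PHP's juggling rules are
different and far more permissive). The names below are made up for the
example.

```python
# Type casting: the user asks for the conversion explicitly.
cast_value = int("42")         # str -> int, requested by the caller

# Type coercion: the language converts implicitly to make the operation work.
coerced_sum = 1 + 2.5          # the int 1 is converted to float before adding


def add_without_cast():
    # Python does not coerce str to int, so this fails unless the caller casts.
    try:
        return "42" + 1
    except TypeError:
        return None
```

How far a language is willing to go with the implicit case is exactly where
languages differ: Python stops at numeric types, while PHP also juggles
between strings and numbers.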

* General considerations regarding type conversion

Type coercion has usually been considered harmful and unsafe, and
very prone to raise what has been generally referred to on this
list as the "WTF factor" (expected behavior that raises reasonable doubts as
to "who could really expect that to happen?", pre-announced with a
WTF onomatopoeia). Nevertheless, modern language trends indicate that type
coercion has come into rather widespread use. C++ has had type coercion for a
very long time (can we call C++ a "modern language"?), many scripting
languages have type coercion, and even Java has introduced type coercion
between its native types and their wrappers (what's called autoboxing and
unboxing in java-dom).

- Type inference

A variant of static typing where types are left undeclared or given as type
parameters. The actual types are filled in where the code is used, completing
the blanks with the types needed.
This is a common feature of functional languages, and is also known to be
used in imperative languages with what's called "generic programming".

- Generic programming (since it's mentioned before but it really doesn't
matter to this debate)

Also known as "template programming".
Generic programming is a programming technique where code is
programmatically generated on demand from code templates, whose template
parameters are filled in according to the needs of the actual code.
It can be supported by the language or implemented as an external
precompiler (or both).
PHP inadvertently supports some form of generic programming.
It's usually confused with parametric typing. Although generic programming
may allow some form of parametric typing, it's not the same concept and
should not be confused.
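
A rough sketch of the "code generated from templates" idea, written in Python
for illustration only. The template text, the instantiate helper and the
generated function names are all hypothetical; real template systems (C++
templates, external precompilers) work at compile time rather than through a
runtime exec call.

```python
from string import Template

# A code template with three parameters: a name suffix, the element type to
# enforce, and the starting value of the accumulator.
SUM_TEMPLATE = Template('''
def sum_$name(values):
    total = $zero
    for v in values:
        if not isinstance(v, $type):
            raise TypeError("expected $type")
        total += v
    return total
''')


def instantiate(name, type_name, zero):
    """Generate and compile one concrete function from the template."""
    source = SUM_TEMPLATE.substitute(name=name, type=type_name, zero=zero)
    namespace = {}
    exec(source, namespace)    # compile the generated source on demand
    return namespace["sum_" + name]


sum_int = instantiate("int", "int", "0")
sum_str = instantiate("str", "str", "''")
```

Each call to instantiate produces a separate function specialized for one
type, which is the sense in which the code is generated "on demand".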

- Parametric typing (doesn't matter either, but mentioned before, so for
clarification)

Parametric typing allows the definition of parametrized types.
Parametrized types are types that are defined on other types.
The best-known parametrized type in computer science is the array (not the
array from dynamic-typing-dom, at least). The computer science "canon"
considers the array to be parametrized by the type of value it can hold, so
we have an array of ints, an array of strings, an array of X, etc, etc.
Since dynamic typing binds the type to the value, arrays in dynamic typing
can hold any type, so such an array is not, strictly speaking, a parametrized
type.
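
The contrast can be seen in Python, which happens to offer both kinds of
container: the standard library array module provides arrays parametrized by
an element type code, while the built-in list holds values of any type. This
is only an illustrative sketch; try_append is a made-up helper.

```python
from array import array

ints = array("i", [1, 2, 3])   # an array parametrized to hold C ints only
anything = [1, "two", 3.0]     # a list holds values of any type


def try_append(container, value):
    """Return True if the container accepts the value, False on a type error."""
    try:
        container.append(value)
        return True
    except TypeError:
        return False
```

Here try_append(ints, "four") is refused because the element type is part of
the array's type, whereas try_append(anything, "four") succeeds.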

- Typing strength

Typing strength is the measurement of how much the language forces you to
work with types. How this measurement is done is a matter of large
discussion.

- Strong typing

There's a general acceptance that "strong typing" refers to the use of
static typing and the lack of type coercion and type inference.
A more radical view on "strong typing" dictates that type conversion should
be completely disallowed. We can refer to those conditions as "very strong
typing".

- Weak typing

Weak typing is the opposite of strong typing.
Generally speaking, if a language has dynamic typing, type coercion, or type
inference, it's usually considered to be weakly typed. The degree of this
weakness varies with how far it lets you circumvent the use of types.

- Type safety

Type safety is the set of mechanisms provided to prevent errors caused by
type misuse, commonly referred to as "type errors".

- Type signature

A type signature is the set of expected types for the input and output data
of an algorithm.
It's generally used to refer to the types of the arguments and the return
value of a function or method.
The language must guarantee type safety by ensuring that the type signature
is obeyed.
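
One way to picture "ensuring that the type signature is obeyed" in a
dynamically typed language is a wrapper that checks argument types on entry
and the return type on exit. The following Python sketch assumes plain classes
as annotations; enforce_signature and repeat are hypothetical names, and this
is not how PHP implements its checks.

```python
import functools
import inspect


def enforce_signature(func):
    """Check annotated argument types on entry and the return type on exit."""
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Entry point: check every argument that carries an annotation.
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = sig.parameters[name].annotation
            if expected is inspect.Parameter.empty:
                continue
            if not isinstance(value, expected):
                raise TypeError(f"{name} must be {expected.__name__}")

        result = func(*args, **kwargs)

        # Exit point: check the return value if a return type was declared.
        expected = sig.return_annotation
        if expected is not inspect.Signature.empty:
            if not isinstance(result, expected):
                raise TypeError(f"return value must be {expected.__name__}")
        return result

    return wrapper


@enforce_signature
def repeat(text: str, times: int) -> str:
    return text * times
```

Nothing inside the body of repeat is checked; only the values crossing the
function boundary are.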

- Type hinting

Also known as "type annotations", but that term is discouraged since it can be
confused with the type declarations of static typing.
It's a relatively new concept, and there's not much literature on the subject.
One source of definitions for type hinting is the ECMAScript4 proposals, where
this feature is referred to as "type annotations".
Type hinting is the optional use of static typing checks in a dynamic typing
environment.
It's meant to provide an optional increase of type safety, to those who are
used to static typing, in a scripting environment that traditionally uses
dynamic typing.

* Performance considerations on type signature vs type hinting

These considerations are only relevant in a dynamic typing environment.
A type signature only demands checks at the entry point and the exit point,
so fewer checks are usually done.
Type hinting demands that the type be checked wherever necessary to ensure
the type constraint is followed, as if static typing were used. Therefore
type hinting is much more time consuming due to type checks. Nevertheless,
these type checks can be avoided if enough type hints are known at compile
time. Moreover, when these conditions
are met, the compiler could theoretically profit from the performance
benefits inherent to static typing.
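
To give a feel for the cost of checking "wherever necessary", the Python
sketch below simulates a hinted variable with a descriptor that re-checks the
type on every assignment and counts the checks. Typed, Counter and count_to
are hypothetical names, and a naive runtime check like this is only a stand-in
for what a real implementation of type hinting would do.

```python
class Typed:
    """Descriptor that re-checks the type on every assignment."""

    checks = 0  # counts how many type checks were performed in total

    def __init__(self, expected):
        self.expected = expected

    def __set_name__(self, owner, name):
        self.name = "_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self.name)

    def __set__(self, obj, value):
        Typed.checks += 1
        if not isinstance(value, self.expected):
            raise TypeError(f"expected {self.expected.__name__}")
        setattr(obj, self.name, value)


class Counter:
    value = Typed(int)  # behaves like a variable hinted as int

    def __init__(self):
        self.value = 0


def count_to(n):
    c = Counter()
    for _ in range(n):
        c.value = c.value + 1  # every assignment triggers a check
    return c.value
```

The number of checks grows with the number of assignments rather than with
the number of calls. A compiler that could prove c.value + 1 is always an int
could drop every check inside the loop, which is the optimization described
above.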

---- Now, some random facts ----

% You should have guessed by now the irony of C being strongly typed and C++
being weakly typed. Herein lie most of the reasons for the discussion on what
the hell is wrong with the scale that measures typing strength... (I
believe it is not the scale, just the fact that it seems hard to recognize C++
as a completely different language)

% Yeap, strictly speaking, PHP doesn't support type hinting, just type
signature.

% PHP uses type coercion for native types, but silently ignores type
conversion errors = random WTF

% If you want type signatures with native types in PHP, you'll have to do
type coercion (the array type hint should do it too). If not, you're just
making strong typing demands in a strong-typing-unfriendly environment.

% PHP type coercion + type coercion on type signatures = popcorn WTF

% Type coercion on type signature and passing by reference do not coexist
nicely. A challenge for the daring to solve.

% A full implementation of type hinting is quite a challenge and is better
left to a major version release. Some opcode mojo and backpatching voodoo
should be concocted to get the juice out of this fruit.

----

I might sound like a broken record by now, but, sorry for my never-ending
intrusions.
I'm just too much of a geek to resist the temptation of meddling into a CS
related debate.

Best Regards,
Your friendly neighborhood wannabe game developer,
Martin Alterisio

----- End of forwarded message from Martin Alterisio -----

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
