On Friday, 10 August 2012 at 22:01:46 UTC, Walter Bright wrote:
> It catches only a subset of these at compile time. I can craft
> any number of ways of getting it to miss diagnosing it.
> Consider this one:
>
>     float z;
>     if (condition1)
>         z = 5;
>     ... lotsa code ...
>     if (condition2)
>         z++;
>
> To diagnose this correctly, the static analyzer would have to
> determine that condition1 produces the same result as
> condition2, or not. This is impossible to prove. So the static
> analyzer either gives up and lets it pass, or issues an
> incorrect diagnostic. So our intrepid programmer is forced to
> write:
>
>     float z = 0;
>     if (condition1)
>         z = 5;
>     ... lotsa code ...
>     if (condition2)
>         z++;
>
> Now, as it may turn out, for your algorithm the value "0" is an
> out-of-range, incorrect value. Not a problem as it is a dead
> assignment, right?
>
> But then the maintenance programmer comes along and changes
> condition1 so it is not always the same as condition2, and now
> the z++ sees the invalid "0" value sometimes, and a silent bug
> is introduced.
>
> This bug will not remain undetected with the default NaN
> initialization.

The compiler in languages like C# doesn't try to prove that the
variable is NOT set and then emit an error. It tries to prove
that the variable IS set, and if it can't prove that, it's an
error.

It's not an incorrect diagnostic; it does exactly what it's
supposed to do, and the programmer has to be explicit when
taking on the responsibility of initialization. I don't see
anybody complaining about this feature in C#; most experienced C#
programmers I've talked to love it (I much prefer it too).
Leaving a local variable initially uninitialized (or rather, not
explicitly initialized) is a good way to portray the intention
that it's going to be conditionally initialized later. In C#, if
your program compiles, your variable is guaranteed to be
initialized later but before use. This is a useful guarantee when
reading/maintaining code.
In D, on the other hand, it's possible to write D code like:

    for (size_t i; i < length; ++i)
    {
        ...
    }

And I've actually seen this kind of code a lot in the wild. It
boggles my mind that you think that this code should be legal. I
think it's lazy - the intention is not clear. Is the default
initializer being intentionally relied on, or was it
unintentional? I've seen both cases. The for-loop example is an
extreme one for demonstrative purposes; most examples are less
obvious.
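
To make the ambiguity concrete, here is a minimal sketch. The function names and the loop bound are made up for illustration. Both functions return the same result because size_t.init is 0; only the second one tells the reader that starting at zero was a deliberate choice.

```d
// Hypothetical helper: relies on the default initializer of size_t.
size_t countImplicit(size_t length)
{
    size_t n;
    for (size_t i; i < length; ++i) // i silently starts at size_t.init
        ++n;
    return n;
}

// Hypothetical helper: same loop with the starting values spelled out.
size_t countExplicit(size_t length)
{
    size_t n = 0;
    for (size_t i = 0; i < length; ++i)
        ++n;
    return n;
}

void main()
{
    // The default initializer for size_t is zero.
    static assert(size_t.init == 0);

    assert(countImplicit(3) == 3);
    assert(countExplicit(3) == 3);
}
```

The compiler accepts both forms today, so nothing in the first version says whether the omission was intended.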
Saying that most programmers will explicitly initialize floating
point numbers to 0 instead of NaN when taking on initialization
responsibility is a cop-out - float.init and float.nan are
obviously the values you should be going for. The benefit is easy
for programmers to understand, especially if they already
understand why float.init is NaN. You say yelling at them
probably won't help - why not? I personally use
float.init/double.init etc. in my own code, and I'm sure other
informed programmers do too. I can understand why people don't do
it in, say, C, with NaN being less well defined there afaik. D
promotes NaN actively, and programmers should be eager to leverage
NaN explicitly too.
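
As a rough sketch of what I mean, here is the quoted example with the initialization written out as float.init rather than 0. The function name compute and the two bool parameters are stand-ins I made up for condition1 and condition2; isNaN comes from std.math.

```d
import std.math : isNaN;

// Hypothetical stand-in for the quoted snippet.
float compute(bool condition1, bool condition2)
{
    float z = float.init; // explicit: "no meaningful value yet" (NaN)
    if (condition1)
        z = 5;
    if (condition2)
        z++; // NaN + 1 is still NaN, so a missed assignment stays visible
    return z;
}

void main()
{
    // Both conditions agree: the result is an ordinary number.
    assert(compute(true, true) == 6);

    // The conditions diverge: the result is NaN instead of a
    // plausible-looking 1, so the mistake does not go unnoticed.
    assert(isNaN(compute(false, true)));
}
```

Had z been initialized to 0, the second call would have quietly returned 1.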
It's also important to note that C# works the same as D for
non-local variables - they all have a defined default initializer
(the C# equivalent of T.init is default(T)). Another point is
that the local-variable analysis is limited to the scope of a
single function body; it does not do inter-procedural analysis.
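
For reference, this is roughly what the defined default initializers look like on the D side. The struct Sample and its field names are invented for the example; the values checked are the T.init values of the built-in types.

```d
import std.math : isNaN;

// Hypothetical aggregate: its fields are non-local variables and
// receive their type's default initializer.
struct Sample
{
    int count;   // int.init is 0
    float ratio; // float.init is NaN
    char tag;    // char.init is 0xFF
    int* ptr;    // pointers default to null
}

void main()
{
    Sample s; // no explicit initialization anywhere

    assert(s.count == 0);
    assert(isNaN(s.ratio));
    assert(s.tag == 0xFF);
    assert(s.ptr is null);
}
```
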
I think this would be a great thing for D, and I believe that all
code this change breaks is actually broken to begin with.