On Tuesday, 5 August 2014 at 21:17:14 UTC, H. S. Teoh via Digitalmars-d wrote:
On Tue, Aug 05, 2014 at 08:11:16PM +0000, via Digitalmars-d wrote:
On Tuesday, 5 August 2014 at 18:57:40 UTC, H. S. Teoh via Digitalmars-d
wrote:
>Exactly. I think part of the problem is that people have been using
>assert with the wrong meaning. In my mind, 'assert(x)' doesn't mean
>"abort if x is false in debug mode but silently ignore in release
>mode", as some people apparently think it means. To me, it means "at
>this point in the program, x is true". It's that simple.

A language construct with such a meaning is useless as a safety
feature.

I don't see it as a safety feature at all.

Sorry, I should have written "correctness feature". I agree that it's not very useful for safety per se. (But of course, safety and correctness are not unrelated.)



If I first have to prove that the condition is true before I can safely use an assert, I don't need the assert anymore, because I've
already proved it.

I see it as future proofing: I may have proven the condition for *this* version of the program, but all software will change (unless it's dead), and change means the original proof may no longer be valid, but this
part of the code is still written under the assumption that the
condition holds. In most cases, it *does* still hold, so in general
you're OK, but sometimes a change invalidates an axiom that, in
consequence, invalidates the assertion. Then the assertion will trip (in non-release mode, of course), telling me that my program logic has become invalid due to the change I made. So I'll have to fix the
problem so that the condition holds again.

Well, I think it's unlikely that you actually did prove the assert condition, except in trivial situations. This is related to the discussion about the ranges example, so I'll respond there.



If it is intended to be an optimization hint, it should be implemented as a pragma, not as a prominent feature meant to be widely used. (But I see that you have a different use case, see my comment below.)

And here is the beauty of the idea: rather than polluting my code with optimization hints, which are out-of-band (and which are generally unverified and may be outright wrong after the code undergoes several revisions), I am stating *facts* about my program logic that must hold -- which therefore fits in very logically with the code itself. It even self-documents the code, to some extent. Then as an added benefit, the compiler is able to use these facts to emit more efficient code. So to me, it *should* be a prominent, widely-used feature. I would use it, and
use it a *lot*.

I think this is where we disagree mainly: What you call facts is something I see as intentions that *should* be true, but are not *proven* to be so. Again, see below.



>The optimizer only guarantees (in theory) consistent program
>behaviour if the program is valid to begin with. If the program is
>invalid, all bets are off as to what its "optimized" version does.

There is a difference between invalid and undefined: A program is invalid ("buggy") if it doesn't do what its programmer intended, while "undefined" is a matter of the language specification. The (wrong) behaviour of an invalid program need not be undefined, and often isn't in practice.

To me, this distinction doesn't matter in practice, because in practice, an invalid program produces a wrong result, and a program with undefined behaviour also produces a wrong result. I don't care what kind of wrong result it is; what I care about is fixing the program so that it does *not* produce a wrong result.

Please see my response to Jeremy; the distinction is important:
http://forum.dlang.org/thread/hqxoldeyugkazolll...@forum.dlang.org?page=11#post-eqlyruvwmzbpemvnrebw:40forum.dlang.org



An optimizer may only transform code in a way that keeps the resulting
code semantically equivalent. This means that if the original
"unoptimized" program is well-defined, the optimized one will be too.

That's a nice property to have, but again, if my program produces a wrong result, then my program produces a wrong result. As a language user, I don't care that the optimizer may change one wrong result to a different wrong result. What I care about is to fix the code so that the program produces the *correct* result. To me, it only matters that the optimizer does the Right Thing when the program is correct to begin with. If the program was wrong, then it doesn't matter if the optimizer makes it a different kind of wrong; the program should be fixed so that
it stops being wrong.

We're not living in an ideal world, unfortunately. It is bad enough that programs are wrong as they are written; we don't need the compiler to transform these programs to do something that is still wrong, but also completely different. This would make your goal of fixing the program very hard to achieve. In an extreme case, a small error in several million lines of code could manifest at a completely different place, because you cannot rely on any determinism once undefined behaviour is involved.



>Yes, the people using assert as a kind of "check in debug mode but
>ignore in release mode" should really be using something else
>instead, since that's not what assert means. I'm honestly astounded
>that people would actually use assert as some kind of
>non-release-mode-check instead of the statement of truth that it was
>meant to be.

Well, when this "something else" is introduced, it will need to
replace almost every existing instance of "assert", as the latter must only be used if it is proven that the condition is always true. To
name just one example, it cannot be used in range `front` and
`popFront` methods to assert that the range is not empty, unless there
is an additional non-assert check directly before it.

I don't follow this reasoning. For .front and .popFront to assert that the range is non-empty, simply means that user code that attempts to do otherwise is wrong by definition, and must be fixed. I don't care if it's wrong as in invalid, or wrong as in undefined, the bottom line is
that code that calls .front or .popFront on an empty range is
incorrectly written, and therefore must be fixed.

Just above you wrote that you "may have proven the condition". But in code like the following, there cannot be a proof:

    @property T front() {
        assert(!empty);
        return _other_range.front;
    }

This is in the standard library. The authors of this piece of code cannot have proven that the user of the library only calls `front` on a non-empty range. Now consider the following example (mostly made up, but not unrealistic) that parses a text file (this could be a simple text-based data format):

    // ...
    // some processing
    // ...
    input.popFront();
    // end of line? => swallow and process next line
    if(input.front == '\n') { // <- this is wrong
        input.popFront();
        continue;
    }
    // ...
    // more code that doesn't call `input.popFront`
    // ...
    // more processing of input
    if(!input.empty) {    // <- HERE
        // use input.front
    }

With the above definition of `front`, the second check marked "HERE" can be removed by the compiler. Even worse, if you insert `writeln(input.empty)` before the check for debugging, it might also output "false" (depending on how far the compiler goes).

Yes, this code is wrong. But it's an easy mistake to make, it might not be detected during testing because you only use correctly formatted input files, and it might also not lead to crashes (the buffer is unlikely to end at a boundary to unmapped memory).

Now the assert - which is supposed to be helping the programmer write correct code - has made it _harder_ to detect the cause of an error.

What's worse is that it also removed a check that was necessary. The programmer could have inserted this check because that section of the code is security relevant, and they didn't want to rely on the input file being correct. The compiler has thereby turned a rather harmless mistake, one that would under normal circumstances only lead to incorrect output, into a potentially exploitable security bug.


-- snip --
But if I've convinced myself that it is
correct, then I might as well disable the emptiness checks so that my product will deliver top performance -- since that wouldn't be a problem
in a correct program.

The problem is, as I explained above, that it doesn't just disable the emptiness checks where the asserts are. A simple mistake can have subtle and hard to debug effects all over your program.

In theory, the optimizer could use CTFE to reduce the function call, and thereby discover that the code is invalid. We don't have that today, but
conceivably, we can achieve that one day.

But taking a step back, there's only so much the compiler can do at compile-time. You can't stop *every* unsafe usage of something without also making it useless. While the manufacturer of a sharp cutting tool will presumably do their best to ensure the tool is safe to use, it's impossible to prevent *every* possible unsafe usage of said tool. If the user points the chainsaw at his foot, he will lose his foot, and there's nothing the manufacturer can do to prevent this except shipping a non-functional chainsaw. If the user insists on asserting things that are untrue, there will always be a way to bypass the compiler's static
checks and end up with undefined behaviour at runtime.

I wouldn't be so pessimistic ;-)

I guess most assert conditions are simple, mostly just comparisons or equality checks of one value against a constant. These should be relatively easy to verify with some control/data flow analysis (which Walter has avoided until now, understandably).

But CTFE is on the wrong level. It could only detect some of the failed conditions. They need to be checked on a higher level, with real correctness proofs. If an assert condition cannot be proven -- because it's always wrong, or just sometimes, or because the knowledge available to the compiler is insufficient -- it must be rejected. Think of it as an extension of type and const checking.



It would be great if this were possible. In the example of `front` and `popFront`, programs that call these methods on a range that could theoretically be empty wouldn't compile. This might be useful for optimization, but above that it's useful for verifying correctness.

A sufficiently aggressive optimizer might be able to verify this at compile-time by static analysis. But even that has its limits... for
example:

        MyRange range;
        assert(range.empty);
        if (solveRiemannHypothesis()) // <-- don't know if this is true
                range.addElements(...);

        range.popFront(); // <-- should this compile or not?

It shouldn't, because it's not provable. However, most asserts are far less involved. There could be a specification of what is guaranteed to work, and what all compilers must therefore support.



Unfortunately this is not what has been suggested (nor, evidently, what was intended from the beginning)...

I don't think static analysis is *excluded* by the current proposal. I can see it as a possible future enhancement. But the fact that we don't
have it today doesn't mean we shouldn't step in that direction.

I just don't see how we're stepping into that direction at all. It seems like the opposite: instead of trying to prove the assertions statically, they're going to be believed without verification.
